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Abstract 

Othello  is  a  recent  addition  to  the  collection  of  games  that  have  been  examined  within  artificial  intelligence. 
Advances  have  been  rapid,  yielding  programs  that  have  reached  the  level  of  world-championship  play.  'litis 
article  describes  the  current  champion  Othello  program,  logo.  The  work  described  +te«r’mcludos:  (1)  a  task 
analysis  of  Othello;  (2)  the  implcmcnation  of  a  program  based  on  this  analysis  and  state-of-the-art  Al  game- 
playing  techniques;  and  (3)  an  evaluation  of  the  program’s  performance  through  games  played  against  other 
programs  and  comparisons  with  expert  human  play. 
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1.  Introduction 

The  game  *»f  Othello1  is  a  modern  variation  of  the  I9,h  century  hoard  game  Reversi.  It  was  developed  in  its 
current  form  in  1974  by  Goro  Hascgawa  (sec  Personal  Computing  (1980)  foi  more  details),  and  op  until 
recently  it  has  been  played  primarily  in  Japan,  where  it  is  the  second  most  popular  board  game  (next  to  Go). 
Othello  is  similar  to  backgammon  in  its  appeal,  the  rules  are  simple  an  enjoyable  game  can  he  played  by 
beginners,  and  there  is  a  great  deal  of  complexity  dial  must  be  dealt  w  ith  in  order  to  play  the  game  well.  One 
of  the  purposes  of  the  present  article  is  to  face  this  complexity  by  treating  the  game  of  Othello  as  a  subject  for 
scientific  analysis.  The  primary  purpose  of  this  article  is  to  show  how  state-of-the-art  game-playing 
techniques  can  be  applied  to  Othello  to  yield  a  competent  performance  program,  called  logo.  In  its  most 
recent  competition,  logo  won  (he  Santa  Cru/  Open  Machine  Othello  Tournament7,  with  an  8  0  record 

against  an  international  field  of  computer  Othello  programs.  In  his  review  of  the  tournament.  Jonathan  Cerf, 
the  current  world  Othello  champion,  suited  JCcrfglbJ:  "In  my  opinion  the  top  programs  from  Santa  Cru/  arc 
now  equal  (if  not  superior)  to  the  best  human  players." 

The  study  of  game  playing  is  one  of  the  oldest  and  most  developed  portions  of  artificial  intelligence.  I  larly 
work  on  checkers  [Samuel  63]  showed  the  potential  of  Al  techniques  in  game  playing.  More  recently,  the 
field  has  been  driven  primarily  by  advances  in  chess  [Slate  &  Atkin  77],  which  have  led  to  programs  that  are 
fast  approaching  the  master  level  of  play.  Work  on  other  games  has  continued  to  broaden  the  range  of 
application  of  game-playing  methods.  Kor  example,  the  work  by  Berliner  on  backgammon  has  produced  a 
program  that  won  a  match  against  the  world  backgammon  champion,  as  well  as  contributing  new  methods 
and  theory  to  the  field  [Berliner  80], 

The  success  of  logo  is  additional  evidence  of  the  power  of  AI  game-playing  techniques.  When  I  began  this 
effort,  my  knowledge  of  both  Othello  and  game-playing  techniques  was  rudimentary.  My  total  experience 
with  Othello  consisted  of  about  ten  games  against  a  very  poor  program  (which  played  the  maximum  disc 
strategy  (Section  3.1.1)).  Since  then,  with  about  5  man-months  of  effort,  logo  has  been  brought  up  to  world- 
championship  level. 

logo  routinely  beats  all  human  players  around  the  CMU  computer  science  department  (including  the 
author).  Because  programs  do  not  currently  participate  in  human  Othello  tournaments,  logo  has  only 
competed  in  two  tournaments  organized  explicitly  for  Othello  programs.  The  first  --  the  Northwestern  Man- 
Machine  Othello  Tournament  -  look  place  in  June  1980.  There  were  eight  invited  competitors,  including  the 


'oihcllo  is  CBS  Inc.'s  registered  trademark  for  its  strategy  disc  game  and  equipment.  Gameboard  design  (c)  1974  CBS  Inc. 
7 A  description  of  the  tournament  and  the  final  standings  can  be  found  in  Frey  (1981a). 
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Ihcn  world  Othello  champion,  Hiroshi  Inouc,  the  U.S.  Othello  champion,  Jonathan  Cerf,  and  an  early  version 
of  /ago.  That  version  of  logo  w  as  written  in  the  Sail  language,  and  was  running  on  CM  U- 101)  (a  1  XxSystcm 
1050  -•  KAIO  processor  with  240K  words  of  primary  memory).  The  other  five  competitors  were  Othello 
programs,  logo  came  in  fifth  --  losing  to  both  human  entries,  and  ending  up  3-1-1  versus  the  other  programs. 
It  is  noteworthy  f  at  each  of  the  humans  lost  to  a  program  in  this  tournament. 

Ihc  second  tournament  --  the  Santa  Cruz  Open  Machine  Othello  Tournament  -  was  held  in  January  1981. 
Ihis  tournament  was  open  to  any  computer  Othello  program,  and  had  no  human  contestants.  It  drew  an 
international  field  of  twenty  programs,  including  the  top  entries  from  the  previous  tournament.  The 
tournament  was  eight  rounds,  with  the  pairings  being  made  so  that  programs  with  comparable  records  wcie 
paired  (with  duplications  eliminated).  An  improved  version  of  /ago  (still  written  in  Sail,  but  now  running  on 
CMU-20C,  a  DeeSystem  2060  --  KI.10  processor  with  512K  words  of  primary  memory)  earned  a  perfect  8-0 
record,  including  wins  against  the  top  four  challengers  (and  seven  out  of  the  top  nine  challengers). 

It  is  difficult  to  compare  the  structure  of  / ago  with  that  of  the  other  leading  Othello  programs,  because 
there  is  a  reluctance  on  the  part  of  most  of  the  authors  to  release  information3.  This  stems  from  a  desire  to 
maintain  a  competitive  edge  (and  for  many,  a  marketing  edge  as  well).  However,  one  thing  is  clear  -- 
computing  power  has  not  been  the  primary  determinant  of  success  in  computer  Othello.  I"he  top  two 
programs  in  the  Northwestern  tournament,  and  the  second  through  seventh  finishers  in  the  Santa  Cruz 
tournament  were  running  on  microcomputers.  It  is  the  structure  and  knowledge  contained  in  die  current 
version  of  logo  that  allow  it  to  perform  at  a  world-championship  level. 

The  topics  covered  in  this  paper  include  a  description  of  the  game  of  Othello  and  its  rules  (Section  2);  an 
analysis  of  Othello  (Section  3);  the  structure  and  mechanisms  in  /ago  (Section  4);  and  a  comparison  of  logo's 
play  with  expert  human  play  (Section  5).  Section  6  contains  some  concluding  remarks. 


JFor  what  has  been  written,  see  Frey  (1980a,  1980b),  Maggs  (1979).  and  Phillips  (1980)  A  brief  description  of  AUtron  (wriuen  by 
Charles  Meath),  the  second  place  program  in  the  Santa  Cru?  Open  Computer  Othello  Tournament,  appears  in  Cerf  (1981b)  (along  with  a 
brief  description  of  logo)  To  the  level  at  which  it  is  described,  AUmro*  appears  to  employ  information  not  too  different  from  that  in 

Mf* 
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2.  The  Game  of  Othello 

Conceptually  Othello  is  a  derivative  of  the  Go  family  of  board  games:  emphasizing  the  capture  of  territory 
through  the  process  of  surrounding  die  opponent's  pieces.  It  is  played  on  an  8x8  board,  with  a  set  of  dual- 
colored  discs.  Kach  disc  is  black  on  one  side  and  white  on  the  other.  The  initial  board  configuration  is  shown 
in  Figure  2- la.  Othello  notation  differs  from  standard  Chess  notation  in  numbering  the  rows  from  top  to 
bottom,  so  the  top  left-hand  corner  square  is  al.  In  addition,  many  of  the  squares  have  standard  names 
(Figure  2-  lb),  such  as  square  b2  which  is  an  x-square 4. 

Initially,  white  owns  the  two  central  squares  on  the  main  diagonal  (d4  and  e5),  and  black  owns  the  two 
central  squares  on  the  minor  diagonal  (c4  and  d5).  Black  plays  first,  and  then  the  players  take  turns  moving 
until  neither  side  has  a  legal  move.  The  player  with  the  most  discs  at  this  point  is  declared  the  winner  (there 
may  be  tics).  In  Othello,  a  player  controls  only  those  squares  that  have  a  disc  with  his  color  on  them. 
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(b)  The  named  board  squares 
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Figure  2-1:  The  Othello  board 

A  move  is  made  by  placing  a  disc  on  the  board,  with  the  player’s  color  facing  up.  In  order  for  a  move  to  be 
legal ,  the  square  must  be  empty  prior  ;o  play,  and  placing  the  disc  must  capture  some  of  the  opponent's  discs 
by  flipping  them  to  the  player’s  color.  Figure  2-2  shows  an  example  of  a  legal  move  by  black  to  square  d4,  and 
the  resulting  changes  in  disc  ownership.  The  board  position  in  Figure  2- 2a  could  not  possibly  have  arisen 
during  a  game;  it  is  intended  solely  to  illustrate  a  legal  move. 

The  opponent’s  discs  arc  flipped  by  bracketing  them  between  the  disc  being  played  and  an  existing  disc 
belonging  to  the  player.  Bracketing  can  occur  in  a  straight  line  in  any  of  the  eight  directions  (two  vertical,  two 
horizontal,  and  two  in  each  diagonal  direction),  and  consists  of  an  arrangement  of  the  form:  the  empty  square. 


4Sec  Sullivan  and  Richards  (1981)  for  a  glossary  of  this  and  other  Othello  terms. 
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(a)  Before  black's  move  at  d4 


(b)  After  black’s  move  at  d4 


Figure  2-2:  Contrived  example  to  show  results  of  a  legal  play. 

The  marker  (the  solid  black  square)  in  (a)  points  to  where  the  disc  will  be  played. 

followed  immediately  by  one  or  more  of  the  opponent's  discs,  followed  immediately  by  one  of  the  player’s 
discs.  This  arrangement  can  be  found  in  two  directions  from  d4:  horizontally  to  the  right  (flipping  the  discs  at 
e4,f4,  and  g4)\  and  on  the  lower  right  diagonal  (flipping  the  disc  at  e5).  There  can  be  no  empty  squares  within 
this  string,  so  the  disc  at  d5  does  not  flip.  Discovered  bracketing  docs  not  cause  discs  to  be  flipped;  even 
though  the  disc  at  square  e4  is  flipped,  causing  the  disc  at  eJ  to  be  bracketed,  the  disc  at  eS  is  not  flipped.  All 
of  the  opponent’s  discs  that  arc  bracketed  by  the  new  disc  arc  flipped  to  the  player’s  color.  No  discs  are  ever 
taken  off  the  board  -  they  just  change  color. 

If  the  player  docs  not  have  a  legal  move,  he  loses  his  turn  and  the  opponent  plays  again.  Usually  the  game 
ends  only  after  the  entire  board  has  been  filled  with  pieces,  which  normally  takes  64  -  4  =  60  moves. 
However,  games  can  terminate  before  this  occurs  when  neither  side  is  able  to  move.  In  the  Santa  Cruz 
tournament,  seven  games  (out  of  the  77  games  for  which  a  record  is  available  [Frey  81b],  excluding  all  types  of 
forfeits)  ended  before  the  board  was  filled.  Two  games  ended  because  of  h  ipe-uuis.  56-0  and  42-0,  and  the 
other  five  ended  with  an  average  of  1.4  empty  squares.  In  tournaments,  games  can  also  end  when  a  player 
forfeits  because  of  lack  of  time.  Each  side  is  allowed  30  minutes  in  which  to  make  all  of  their  moves.  This 
yields  an  average  of  one  minute  per  move,  assuming  that  each  side  will  make  half  of  the  moves. 

The  average  winning  margin  at  the  Santa  Cruz  Open  was  27.9  discs  (standard  deviation  of  15.7).  Figure  2-3 
shows  how  the  margin  was  distributed.  Of  the  77  games,  black  won  36  times  (average  margin  of  30.6  discs), 
and  white  won  41  times  (average  margin  of  25.5  discs). 
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3.  An  Analysis  of  Othello 

The  basis  fur  .1  proarm’  a  task  analysis.  Due  to  its  newness.  ()ihell<  sir.iu-gy  is  siill  in  an 

emh;_,cnm.  stage  of  development,  and  useful  descriptions  of  llie  game  are  lacking.  Ihere  is  one  Imglish- 
language  book  on  Othello  |llascgawa  Sl  Brud>  77).  bin  us  advice  is  more  misleading  than  helpful.  Ihc 
analysis  m  this  section  is  limited  to  what  the  author  and  cowoikeis  (prim. ml-.  Hi  me  I  ulend.nf  and  Charles 
I  ciserson)  have  been  able  to  ascertain  about  the  game5. 

3.1 .  Three  simple  strategies 

A  number  of  simple  strategies,  such  as  the  obv  ious  one  of  maximizing  the  numericjl  disc  advantage,  have 
been  suggested  for  Othello.  While  none  of  these  strategics  can  produce  a  competent  performance  program, 
their  examination  docs  lead  to  concepts  that  are  useful  in  a  more  detailed  task  analysis.  In  this  section  we 
examine  three  single-concept  strategics:  (1)  tire  obvious  one,  called  the  maximum  disc  strategy;  t2)  the 
weighted  square  strategy;  and  (3)  the  minimum  disc  strategy. 

3.1 .1 .  The  maximum  disc  strategy 

The  maximum  disc  strategy  derives  from  a  straightforward  attempt  to  use  the  top-level  goal  of  the  game 
(possess  the  most  discs  at  the  end  of  the  game)  as  a  playing  heuristic.  As  such,  it  is  a  form  of  gm  d\  algorithm 
The  goal  gets  translated  into  a  hce  istic  of  the  follow  ing  form. 

Play  the  move  which  captures  the  most  discs 

As  an  evaluation  function  for  a  program,  this  heuristic  is  generally  implemented  by  maxtnu/mg  the  difference 
in  the  number  of  discs  for  each  side.  This  will  be  positive  when  the  player  has  more  discs,  negative  if  the 
opponent  has  the  majority,  and  /.cro  for  a  tic.  One  of  the  intriguing  aspects  of  Othello  is  that  tins  obvious 
strategy  is  a  miserable  failure.  Just  how  wrong  this  strategy  can  be  is  well  documented  by  .1  game  played  at 
the  Santa  Cruz  Open  by  lago  against  The  .tf<w(wnlicn  by  David  1  cvy  and  Kevin  O'Connell),  figure  3-la 
shows  the  state  of  the  game  after  move  30  (Black  to  play),  lago  was  playing  black  and  The  Moor  w. is  playing 
white  At  that  point  The  Moor  had  a  30  disc  advantage  (32  to  2)  Its  problems  are  twofold:  ( 1 )  white  has  the 
most  discs,  but  nothing  prevents  black  from  recapturing  them;  and  (2)  white  has  completely  lost  its  mobility. 
In  its  most  general  sense,  mobility  refers  to  the  amount  of  freedom  a  player  has  in  the  selection  of  moves.  In 
this  case,  black  should  be  able  to  force  white  to  move  wherever  he  pleases  lhat  is  precisely  what  happened; 
leading  to  a  final  score  of  51-13  in  favor  of  lago{\ 'igure  3-lb). 

Figure  3-lc  shows  the  record  of  the  game  in  Othello  notation  lhc  numbered  discs  show  when,  and  by 
whom,  a  disc  was  played  in  the  associated  square,  l  or  example,  the  first  move  was  made  bv  Mack  at  df  so 


le  author  is  not  a  serious  Othello  plavcr  and  cvccpt  hncflv  ai  tournaments  espen  plavcrv  have  not  been  available  for  consultation 
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there  is  a  black  disc,  with  die  digit  1  superimposed  on  it  at  J3.  Kach  square  in  the  record  has  a  colored  disc 
and  a  move  number  (between  1  and  60).  Figure  3-2  tracks  die  disc  differential  over  the  whole  game.  Though 
The  Moor  was  probably  not  playing  a  pure  maximum  disc  strategy,  its  strategy  was  greedy  enough  to  show 
that  the  maximum  disc  strategy  is  poor. 
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(a)  After  the  30th  move 
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(b)  The  final  position 
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Figure  3-1:  logo  (B-51)  vs.  The  A/oor(W-13) 
Santa  Cruz  Open  Machine  Othello  Tournament 


Figure  3-2:  The  disc  counts  and  differential 
/ago (B-51)  vs.  TheMoor( W-13) 

Santa  Cruz  Open  Machine  Othello  Tournament 
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3.1.2.  The  weighted  square  strategy 

The  weighted  sqtutre  strategy  stems  from  the  observation  that  not  all  of  the  squares  on  the  Othello  board  are 
of  equal  value.  For  example,  compare  the  corner  squares  (at.  hi.  ah.  hx)  with  the  x -squares  (recall  figure  2- 
lb).  The  corner  squares,  once  occupied,  cannot  be  recaptured,  w  hile  ownership  of  an  x  square  almost  always 
allows  the  opponent  to  capture  the  adjacent  corner.  Ihe  corner  square  thus  seems  a  more  valuable  square. 

The  maximum  disc  strategy  ignores  these  distinctions:  it  just  computes  the  numerical  difference  in  disc 
ownership.  With  the  weighted  square  strategy,  a  sum  is  computed  in  which  each  disc  is  weighted  according  to 
the  value  of  the  square  that  it  occupies.  The  difference  between  the  two  weighted  sums  (one  for  each  player) 
is  the  important  statistic.  The  square  values  can  be  constants,  or  can  change  dynamically  with  the  game 
situation,  depending  on  the  sophistication  of  the  implementation. 


The  version  of  /ago  that  competed  in  the  Northwestern  Man- Machine  Othello  Tournament  played  a 
weighted  square  strategy  (sec  Maggs  (1979)  for  a  description  of  a  typical  weighted-square  program).  Its 
performance  in  that  tournament  revealed  the  inadequacy  of  the  strategy,  /ago  was  soundly  beaten  by  both 
human  entrants,  /ago s  game  against  Hiroshi  lnoue  (won  by  lnouc  43-21)  can  be  seen  in  Figure  3-3a.  Ihe 
weighted  square  strategy  fails  because  there  are  more  important  reasons  for  taking  (or  avoiding)  squares  than 
just  their  location  on  the  board.  For  instance,  mobility  is  crucial.  Starting  from  the  position  in  Figure  3-3b, 
lnoue  was  able  to  make  six  consecutive  moves:  c8 ,  h8.  g7.  a7,  a/,  and  u2.  /ago  had  no  legal  moves  during  that 
entire  span.  The  final  board  position  can  be  seen  in  Figure  3-3c. 
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I  he  weighted  square  strategy  is  not  all  had.  Against  the  otliei  live  ptogr.um  in  the  Northwestern 
tournament,  logo  ended  up  a  respectable  3-1-1.  the  strategy  is  not.  however,  sufficient  to  play  Othello  at  a 
high  level  of  skilL 

3.1.3.  Thw  minimum  dine  itnltgy 

Ihe  remaining  simple  strategy  that  is  worth  examining  is  the  minimum  disc  siiatcgy.  Both  the  maximum 
disc  and  weighted  square  strategies  ran  into  mobility  problems  because  of  their  greediness.  Ihe  minimum  Jix 
strategy  attempts  to  alleviate  this  h>  taking  the  opposite  tack.  A  player  using  the  minimum  disc  strategy 
minimizes  the  number  of  his  own  discs,  while  maximi/ing  the  number  of  his  opponent's  discs. 

I  have  no  direct  experience  with  this  strategy.  It  is  an  approximation  to  the  mobility  measures  that  will  be 
discussed  shortly,  but  its  emphasis  is  wrong.  Where  the  discs  are  placed  is  as  crucial  as  the  simple  count  of 
how  many  there  are.  Having  only  a  few  poorly  placed  discs  can  be  disastrous,  as  the  position  in  Figure  3-4 
shows.  Black  has  fewer  discs  (7  to  39),  but  only  two  moves  --  both  of  which  allow  white  to  lake  comer 
squares.  This  example  is  from  Stringham  ( 1980).  which  contains  a  discussion  of  the  disc  counting  strategies. 
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Figure  >4:  A  board  position  that  is  misjudged  by  the  minimum  disc  strategy  -  Black  to  play. 
This  position  is  taken  from  Stringham  (1980) 
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3.2. 0«n«ral  analysis 

In  *  game  of  Othello,  the  players  must  maintain  two  top-level  gtuls  for  then  play  One  lop-feve!  goal  n 
obvious,  to  win  the  game.  The  other  top-level  goal,  maximizing  the  winning  margin,  is  important  during 
tournaments  in  which  total  disc  differential  is  used  as  a  lie  breaking  mechanism  I  h*.  differential  is  also  the 
primary  factor  in  the  official  ratings  of  the  United  Suites  Othello  Assocmihni  (l  SO  N)  A  player's  rating  will 
be  lowered,  even  if  he  wins,  if  he  does  not  heat  hts  opponent  by  an  amount  determined  h>  their  relative 
ratings*. 

As  with  many  other  games,  Othello  games  can  be  temporally  subdivided  into  three  phases:  ( I )  the  opening 
game;  (2)  the  middle  game;  and  (3)  the  end  game,  there  arc  no  firm  boundaries  between  the  phases  Rather, 
they  are  characterized  by  their  different  strategic  concerns,  in  the  remainder  of  this  section  we  examine  the 
three  phases  of  an  Othello  game  in  reverse  chronological  order.  The  analyses  of  the  early  phases  of  the  game 
are  motivated  by  what  comes  later  in  the  game. 

3.2. 1 .  Thw  snd  gams 

As  the  end  of  the  game  nears,  the  primary  concern  is  with  maximizing  the  disc  differential  However,  as 
was  shown  by  thj  failure  of  the  maximum  disc  strategy,  flipping  a  disc  does  little  good  if  the  opponent  can 
immediately  flip  it  back.  The  concern  must  be  with  maximizing  the  final  disc  differential. 

3.2.1 .1 .  Solving  and-gam*  positions 

One  way  to  maximize  the  final  disc  differential  is  by  solving  the  position;  that  is.  to  .completely  search  the 
game  tree  to  the  end  of  the  game.  In  Othello,  the  branching  factor  of  the  game  tree  is  always  bounded  above 
by  the  number  of  empty  squares  on  the  board,  which  diminishes  as  the  end  approaches.  Figure  .1-5  shows  the 
average  number  of  moves  available  at  each  point  in  the  game.  The  data  »  averaged  over  the  ten  games  played 
by  l»go  at  the  Santa  Cruz  Open  (includes  two  unofficial  games  also  won  by  Imgo),  using  the  values  for  both 
players.  Pairs  of  adjacent  moves  (the  first  move  for  each  player  (moves  1  and  2).  the  second  move  for  each 
player  (moves  3  and  4),  etc.)  were  then  averaged  to  yield  the  points  in  Figure  3-5.  Because  of  the  small 
branching  factor  near  the  end,  complete  analysis  of  final  move  sequences  plays  an  important  part  in  successful 
end  game  play. 


*The  ratings  are  defined  so  that  every  ten  points  difference  between  two  players  means  one  disc  difference  in  tcore  For  example,  a 
lame  between  two  players  whose  rankings  differ  by  200  points  (the  interval  between  classes  of  players)  should  end  with  a  disc  differential 
of  20  (a  final  score  of  42-22)  Ratings  below  R00  points  arc  in  the  novice  class:  over  1999  points  b  a  senior  master  The  highest  ranking 
(1796  -  high  expert)  currently  belongs  to  Jonathan  Cerf  For  more  on  the  ratings  system  see  Richards  (1981). 
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Figure  3-5:  Mean  number  of  legal  moves  at  each  point  of  an  Othello  game. 

Computed  from  the  ten  games  played  by  logo  at  the  Santa  Cat?.  Open 

3.2.1 .2.  Stability 

Prior  to  when  complete  analysis  is  possible,  possession  of  final  discs  can  be  guaranteed  through  the 
acquisition  of  stable  discs.  A  completely  stable  disc  can  never  be  recaptured  (flipped)  by  the  opponent. 
Therefore,  the  number  of  stable  discs  for  each  player  is  a  monotonieally  increasing  function  of  move  number. 
Figure  3-6  shows  the  number  of  stable  discs  for  the  game  at  the  Santa  Cruz  Open  between  logo  and  Reversi 
Challenger  (written  by  Dan  and  Kathc  Spracklen).  The  game  (won  by  logo  by  a  score  of  60-3)  can  be  seen  in 
Figure  3-7. 

Four  lines  run  through  each  square  on  the  Othello  board  (one  horizontal,  one  vertical,  and  two  diagonals). 
Two  necessary  (but  not  sufficient)  conditions  for  flipping  a  disc  are  that  there  be  one  of  the  player’s  discs  on 
one  side  of  the  desired  disc  (on  one  of  the  four  lines  through  the  square),  and  a  blank  square  on  the  other  side 
(along  the  same  line).  Neither  condition  requires  immediate  adjacency  (there  may  be  intervening  discs  of  the 
same  color  as  the  one  being  flipped),  but  this  bracketing  along  one  of  the  lines  must  be  possible.  Therefore,  a 
disc  is  provably  stable  if  there  are  no  lines  through  the  square  along  which  both  of  these  conditions  can  ever 
hold. 

The  Othello  board  can  be  divided  into  three  classes  of  squares:  (1)  corner  squares  (a l,  hi,  a8 ,  h8)\  (2)  edge 
squares  (the  a-squarcs,  b-squares,  and  c-squares  in  Figure  2- lb)  ;  and  (3)  internal  squares  (the  remaining 
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Figure  3-6:  The  number  of  stable  discs 
Reversi  Challenger  (B-3)  vs.  /ago  (W-60) 
Santa  Cruz  Open  Computer  Othello  Tournament 
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(a)  The  game  record 


(b)  The  final  position 


Figure  >7:  Reversi  Challenger  (B-3)  vs.  lago  (W-60) 

Santa  Cruz  Open  Machine  Othello  Tournament 

squares).  Discs  in  corner  squares  are  always  completely  stab* .  At  least  one  side  of  each  line  through  a  comer 
square  lies  off  the  playing  surface,  inaccessible  for  disc  placement.  Comer  discs  arc  crucial,  not  only  because 
of  their  own  stability,  but  because  they  commonly  form  an  anchor  by  which  other  discs  can  be  stabilized. 
Edge  discs  are  the  first  to  show  this  effect  Three  of  the  four  lines  through  each  edge  square  lie  off  the  board 


I  AN  \NAl  VMS  Ol  01  Ill'll  O 


II 


«n  one  side,  leaving  only  one  line  of  possible  instability  -  the  edge  itself.  ( )ne  way  of  lemov  ing  iliis  instability 
is  to  have  the  edge  dise  be  immediately  adjacent  to  a  stable  disc  of  the  same  color,  such  as  a  corner  disc.  If 
there  is  no  way  tli.it  the  stable  disc  can  be  flipped  along  the  edge,  the  same  must  be  tine  of  an  adjacent  edge 
disc  of  die  same  color.  In  general,  there  arc  three  types  of  formations  that  stabilize  edge  discs. 

•  If  the  edge  disc  is  adjacent  to  a  stable  edge  (or  corner)  disc  of  the  same  coloi  ibcn  it  is  stable,  l  or 
example,  white’s  discs  at  bl  and  cl  in  Figure  3-8a. 

•  If  the  edge  disc  is  part  of  a  string  of  the  player’s  discs  embedded  between  two  of  the  opponent’s 
stable  discs,  then  it  is  stable,  l  ire  embedded  dise  can  also  be  stable  even  when  the  embedding 
discs  arc  not;  though  black’s  disc  at  h4  is  unstable  (Figure  3-Ra),  white’s  discs  at  ItS  and  h(>  arc 
stable.  All  of  the  empty  squares  in  the  top  portion  of  that  edge  arc  needed  just  to  tl  ip  black’s  disc, 
leaving  no  possibility  of  flipping  white’s  discs. 

•  If  there  arc  no  empty  squares  on  that  edge  (including  corners)  then  all  of  the  discs  on  that  edge  are 
stable.  For  example,  all  of  tire  discs  on  the  bottom  edge  of  die  board  in  Figure  3-8a  are  stable. 


In  addition  to  these  stable  formations,  it  is  possible  (though  rare)  for  an  edge  disc  to  be  stable  simply 
because  it  is  impossible  for  the  opponent  to  make  the  edge  move  that  would  result  in  the  disc  being  Hipped. 
In  Figure  3-8b,  the  black  discs  at  a}-a6  are  stable  because  it  is  impossible  for  white  discs  to  be  played  at  o2 
and  a7. 
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(a)  bl-cl ,  Ii5-h7,  and  bX-gS  arc  stable. 
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Figure  3-8:  Contrived  examples  of  edge  stability. 

Internal  discs  require  stability  along  all  four  lines  through  them,  and  arc  thus  difficult  to  stabilize  until  very 
late  in  the  game.  The  maximum  disc  strategy  ignores  this  fact  (among  others),  and  plays  poorly  .  On  die  other 
hand,  the  weighted  square  strategy  is  a  first-order  attempt  at  handling  the  stability  distinctions  between  the 
regions  of  the  board,  with  weights  diminishing  from  corners  to  edges  to  internal  squares.  Of  course,  this 
approach  is  too  simplistic:  the  impact  of  owning  a  disc  can  extend  far  beyond  the  square  occupied  by  the  disc. 
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I -'or  example,  placing  a  disc  in  a  corner  square  may  allow  a  -Ntiing  ol  discs  along  one  side  to  l-e  .labih/cd  (along 
with  stabilizing  the  corner  square  itself),  while  wreaking  havoc  alone  other  sides  ol  the  ho.ud.  f  igure  3  9a 
shows  a  typical  such  situation,  from  the  first  game  of  the  19X1)  world-championship  match  between  Jonathan 
Cerf  (U.S.A)  and  Takuya  Minima  (Japan).  White  (Minima)  h  is  an  unbalanced  foimuiion  |  lacobs  &.  Jacobs 
79|  along  the  top  edge  of  the  board  --  a  string  ol  five  discs  on  the  side  with  the  uutieis  ,md  one  of  the  c- 
squarcs  empty.  A  move  to  hi  by  white  stabilizes  white  s  discs  along  the  right  side  ot  the  ho.ud.  hut  allows 
black  to  stabilize  the  top  and  left  sides.  In  addition,  black  gets  the  corner  at  aX.  giving  Inm  a  leg  up  on  the 
bottom  edge  of  the  board.  I  he  results  can  he  seen  in  die  final  board  position  (figure  3-91*).  I  he  game  record 
is  in  figure  3-9c. 

a  t>  c  d  e  I  fl  h  a  b  c  d  e  (  8  h  a  b  c  d  e  1  8  h 


mfluRiw  a  jink.  jk.  J 

hkkn:  :®m:  : 

®m:  m:  zmmm 

:: : 

MCCIll 


(a)  White  to  play  (b)  The  final  position  (c)  I  he  game  record 

( Before  Move  50) 


Figure  3-9:  A  case  where  the  corner  move  (50- 111)  leads  to  had  conscqucrccs 
first  game  of  the  19X0  World  Championship  match 
Ccrf(B-44)  vs.  Mnnura  (W-20) 

Strategics  for  occupying  and  avoiding  specific  squares  become  a  major  component  of  play  when  the  effect 
of  a  single  disc  is  critical.  A  player  can  have  one  (or  more)  of  the  following  goals  for  a  square:  ( 1 )  occupy  the 
square  himself;  (2)  avoid  the  squaie;  (3)  have  the  opponent  occupy  the  square;  and  (4)  keep  the  opponent  out 
of  the  square.  Fach  type  of  goal  requires  a  different  strategy,  hut  they  arc  all  based  on  a  common  set  of  four 
components: 

•  Work  on  the  complementary  goal,  for  example,  if  a  specific  square  is  desired,  the  complementary 
goal  of  keeping  the  opponent  out  of  the  square  should  also  be  pursued. 

•  Work  on  ownership  of  the  adjacent  squares.  If  the  player  has  discs  in  the  squares  adjacent  to  the 
contested  square,  the  opponent  is  likely  to  get  the  square  (and  vice-versa). 


•  Look  for  sequences  of  moves  with  the  desired  result. 

•  Maximize  the  player's  mobility,  and  minimize  the  opponent's  mobility.  A  player  with  low  mobility 


Number  of  Legal  Moves 
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can  be  forced  into  taking  undesirable  squares.  A  player  with  high  mobility  has  a  belter  chance  of 
finding  sequences  of  moves  with  the  desired  result. 

3.2.2.  The  middle  game 

During  the  middle  game,  mobility  -  flexibility  in  the  choice  of  moves  --  is  the  critical  strategic  concern, 
l  ack  of  mobility  can  lead  a  player  into  two  distinct  difficulties,  flic  game  at  the  Santa  Cru/  Open  between 
logo  and  The  Moor  ( recall  figure  3-1)  illustrates  one  type  of  difficulty,  logo  took  advantage  of  the  early  loss 
of  mobility  by  The  Moor ,  by  consistently  leaving  The  Moor  w  till  a  choice  among  a  small  set  of  moves.  When 
possible,  the  set  included  only  bad  moves,  resulting  in  a  gradually  deteriorating  position  for  The  Moor. 
f  igure  3- 10  shows  the  number  of  legal  moves  for  each  side  in  the  game. 


Figure  3-10:  'Die  number  of  legal  moves 
/jgo(B-Sl)  vs.  TheMoor( W-13) 

Santa  Cruz  Open  Machine  Othello  Tournament 

As  an  alternative  to  using  a  mobility  deficiency  to  force  bad  moves,  the  opponent’s  moves  can  be 
eliminated  entirely.  The  game  at  the  Northwestern  tournament  between  logo  and  Hiroshi  Inoue  (recall 
Figure  3-3)  illustrates  how  this  can  lead  to  the  second  difficulty,  logo  s  mobility  was  restricted  for  most  of  the 
latter  part  of  the  game.  Inoue  took  advantage  of  this  by  making  six  of  the  last  seven  moves  in  the  game, 
generating  a  large  number  of  stable  discs  at  the  last  minute. 

Whichever  tack  is  taken,  the  development  of  a  mobility  advantage  becomes  a  cnicial  strategic  concern  in 
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the  middle  game.  However,  it  is  dear  that  this  involves  more  than  just  m.uiim/ing  the  difference  in  the 
current  nuntber  of  moves  available  to  die  players.  First,  the  desirability  of  the  moves  matters.  A  player  who 
has  a  large  number  of  moves,  all  of  w  hich  arc  undesirable,  suffers  from  the  first  difficulty  desc  ribed  above  -- 
he  must  make  bad  moves.  The  other  problem  with  the  simple  notion  of  mobility  is  that,  in  addition  to 
immediate  mobility,  the  players  must  consider  what  other  fciturcs  of  the  ho.uJ  will  lead  to  a  long-term 
mobility  advantage.  In  all,  there  are  direc  aspects  ol  die  board  dial  impact  mobility. 

•  The  current  number  of  acceptable 7  moves.  The  difference  in  diis  value  for  the  two  players 
determines  who  has  the  immediate  mobility  advantage.  In  addition,  all  else  being  equal,  the 
advantage  will  propagate  sonic  distance  into  die  future,  impacting  future  mobility. 

•  The  current  number  of  unacceptable  moves.  While  of  little  use  in  the  immediate  situation,  given 
time,  unacceptable  moves  can  become  acceptable.  For  example,  an  x-square  move  becomes 
acceptable  once  cidicr  the  adjacent  corner  is  occupied,  or  it  no  longer  matters  if  the  opponent 
takes  the  corner,  or  the  opponent  has  no  way  of  flipping  die  x-square  disc  along  the  diagonal  (if  he 
has  no  discs  on  the  diagonal,  and  no  way  of  getting  any).  I  his  proves  to  lie  a  common  way  of 
squeezing  out  a  last  one  or  two  reasonable  moves. 

•  The  mobility  potential  of  other  board  configurations.  According  to  the  rules  of  Othello,  a  formation 
consisting  of  (1)  an  empty  square,  followed  by  (2)  a  solid  string  of  the  opponent's  discs,  followed 
by  (3)  one  of  the  player’s  discs,  is  required  for  a  legal  move  in  the  empty  square.  By  working  on 
each  aspect  independently,  at  some  later  point  there  should  be  a  payoff  in  terms  of  an  additional 
advantage  in  the  number  of  moves.  For  example,  one  aspect  involves  discs  that  arc  adjacent  to 
empty  squares  (frontier  discs).  Acquisition  of  frontier  discs  limits  the  long-term  mobility  of  the 
player  while  allowing  the  opponent  to  increase  his  mobility.  Figure  3-11  shows  an  example  from 
the  game  at  the  Santa  Cm/  Open  between  logo  and  Reversi  Challenger.  Reversi  Challenger  has 
built  a  wall  of  frontier  disks  along  the  top.  disallowing  any  moves  by  it  in  the  top  half  of  the  board 
until  after  logo  breaches  the  wall.  logo,  on  the  other  hand,  has  its  choice  of  moves  across  die  wall. 

3.2.3.  Th«  opening  game 

The  opening  game  should  be  based  upon  a  book  containing  successful  opening  sequences  of  moves,  but 
little  has  been  published  (at  least  in  Fnglish)  on  the  subject8.  However,  there  is  agreement  on  the  first  two 
moves  in  the  game.  'Hie  first  move  (by  Black)  is  a  choice  between  four  symmetric  moves  (dffi,  e6.  and  c4)\ 
the  choice  is  arbitrary.  White  selects  the  initial  line  of  play  with  his  first  move  (the  second  move  in  the  game). 
He  has  three  alternatives:  (1)  the  perpendicular  opening  (Figure  3- 1 2a);  (2)  the  diagonal  opening  (l  igurc  3- 
12b);  and  (3)  the  parallel  opening  (Figure  3- 12c).  The  perpendicular  and  diagonal  openings  show  up  in 
expert  Japanese  play  [Albion  80);  the  parallel  opening  does  not. 


7 

Rigorously  defining  ’’acceptable"  is  problematic:  it  is  used  loosely  here  to  refer  to  moves  that  do  not  have  disastrous  consequences. 

g 

Albion  (1980)  contains  a  description  of  one  opening  (the  Maruoka  opening),  and  some  variations. 
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Figure  >1!:  A  wall  trapping  black  in  the  lower  half  of  the  board 
Reversi  Challenger  (Hi)  \s.  lago  (W-(>0) 

Santa  Cm/  Open  Machine  Othello  Tournament 
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(a)  ITic  perpendicular  opening 
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(b)  I  he  diagonal  opening 
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(c)  The  parallel  opening 
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Figure  >12:  Initial  game  records  for  white’s  three  opening  moves 
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4.  lago 

llic  primary  task  faced  by  lago  is  to  select  the  best  move,  whenever  it  is  its  turn  to  play.  I  o  accomplish 
this,  lago  is  built  around  aft  search  (see  Nilsson  (1971)).  and  an  evaluation  based  on  the  Othello  knowledge 
described  in  Section  3.  These  two  components  form  the  heart  of  lago  (and  this  section),  but  they  arc  not 
sufficient  alone  because  of  the  time  constraints  under  which  Othello  games  (at  lea-t  during  tournaments)  arc 
played.  Care  must  be  taken  so  that  lago  can  complete  all  of  its  moves  within  its  allocated  tune  (30  minutes  in 

tournament  games). 


4.1 .  Time  allocation  in  lago 

During  torn  lament  matches,  lago  must  allocate  its  30  minutes  effectively.  This  would  yield  one  minute 
per  move  if  it  were  spread  equally  over  all  of  the  moves  (assuming  that  lago  makes  half  of  the  moves),  l  ive 
factors  make  this  simple  approach  inadequate: 

1.  Time  that  is  left  unused  by  one  move  selection  can  be  fruitfully  added  to  later  moves. 

2.  There  is  overhead  time  for  disc  flipping  (on  the  physical  board),  typing  in  moves,  and  hitting  the 
button  on  die  chess  clock.  This  time  is  charged  to  lago.  in  addition  to  the  time  taken  to  select  a 

move. 

3.  Some  time  must  be  left  in  case  hardware  problems  arise  during  the  match.  Tournament  rules 
vary,  but  computer  downtime  is  frequently  charged  to  the  program  s  clock. 

4.  lago  may  have  to  make  more  than  the  standard  30  moves  (in  case  its  opponent  can  t  move  at  some 
point  in  the  game). 

5.  The  value  of  additional  clock  time  varies  across  moves.  For  example,  it  can  pay  to  allocate  more 
time  to  end-game  moves  if  that  allows  the  positions  to  be  completely  solved. 

Before  beginning  each  move  selection,  lago  generates  a  time  allocation  for  the  selection.  Ihis  allocation  is 
computed  by  multiplying  the  amount  of  time  lago  has  remaining  on  its  clock  by  a  pre-computcd  allocation 
factor  for  the  current  point  in  the  game  (move  number).  These  factors  were  computed  by  the  following 

procedure: 

1.  A  marginal  value  was  assigned  to  each  move  number,  reflecting  the  importance  of  that  point  in 
the  game. 

2.  The  values  were  then  normalized  so  that  die  odd  and  even  fractions  would  separately  sum  to  1 
(corresponding  to  plays  by  black  and  white  respectively). 

3.  An  allocation  fraction  for  each  move  number  was  dicn  computed  as  die  ratio  of  its  marginal  value 
to  the  sum  of  the  marginal  values  from  dicrc  to  the  end  of  the  game. 
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4.  Some  tractions  were  then  reduced  hy  hand  to  leave  room  (or  more  than  30  moves'*. 

The  resulting  normalized  marginal  allocations  are  show  n  in  Figure  4-1.  The  raggedness  near  the  end  of  the 
curve  is  a  result  of  the  hand  modifications.  The  first  move  has  an  abnormally  small  allocation  because  it  is  a 
choice  between  four  symmetric  moves.  Any  time  spent  on  this  selection  is  a  waste,  and  there  is  no  opening 
book  in  which  to  prestore  a  selection. 


Figure  4-1 :  Marginal  time  allocation  fractions  (normalized)  by  move  number 

The  time  allocation  is  generated  by  multiplying  the  amount  of  time  left  on  the  clock  by  the  allocation 
fraction.  This  procedure  takes  care  of  points  1, 4,  and  5.  Potential  hardware  problems  (point  3)  arc  alleviated 
by  removing  2.5  minutes  from  lagos  internal  clock  at  the  beginning  of  the  match,  leaving  it  with  only  27.5 
minutes  available  for  allocation.  Overhead  time  (point  2)  is  handled  by  subtracting  10  seconds  from  the  time 
allocated  for  the  move  selection.  These  10  seconds  arc  then  used  by  the  overhead  activities  instead  of  by  die 
move  selection  procedure. 


o 

This  was  a  post-hoc  solution  that  became  necessary  during  the  Santa  Cruz  Open.  There  were  several  games  in  which  fago  nearly  ran 
out  of  time  (using  28  of  its  30  minutes). 
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4.2.  Search  in  iago 

The  basic  search  procedure  in  Iago  consists  of  an  implementation  of  a  full -width  reu;r-.i\e  up  algorithm. 
The  root  node  of  die  search  tree  returns  the  move  that  must  be  made  in  order  to  reach  its  best  (determined  by 
the  backed-up  evaluation)  child  node  (the  first- ply  nodes).  The  remaining  nodes  back  up  the  evaluation  from 
the  terminal  nodes  of  the  tree. 

Iterative  deepening  [Slate  &  Atkin  77|  is  used  to  determine  the  depth  to  which  Iago  can  search  during  the 
time  allocated.  Iago  performs  a  complete  1-ply  search,  followed  by  a  complete  2-ply  search,  and  so  on.  Iago 
tries  another  ply  when  it  estimates  that  it  can  completely  search  at  least  Min{3,  number  of  first-ply  nodes}  of 
die  first-ply  nodes  at  die  new  depth.  In  order  to  make  this  estimation.  Iago  keeps  track  of  the  amount  of  time 
spent  for  each  iteration,  and  the  average  branching  factor  for  each  player.  Using  this  information,  it 
extrapolates  the  expected  time  to  complete  another  iteration. 

F.vcn  when  Iago  estimates  dial  it  can  finish  another  iteration,  the  amount  of  time  required  to  finish  might 
far  exceed  the  time  allocated  for  that  search.  To  handle  this,  Iago  checks  its  progress  between  each  first-ply 
node,  and  aborts  the  search  if  it  estimates  that  the  time  to  complete  one  more  first-ply  node  will  exceed  1.3 
times  the  allocated  umc.  During  each  search,  the  first-ply  nodes  arc  ordered  according  to  their  values. 
Because  this  ordering  is  used  to  establish  die  search  order  for  the  next  iteration,  even  if  die  search  is  aborted 
after  only  one  first-ply  node,  it  is  safe  to  pick  the  best  move  found  during  the  current  iteration. 

Iago  currently  searches  to  an  average  depth  of  6.3  (i.c.  searches  arc  completed  for  .3  of  die  first-ply  nodes  at 
a  depth  of  7)  .  Deeper  searches  do  result  in  better  performance,  but  increasing  die  dcpdi  of  search  is  not  a 
panacea.  In  die  Northwestern  tournament,  the  two  worst  programs  searched  to  an  average  dcpdi  of  7-8  ply, 
while  the  better  programs  searched  to  only  4  or  5  ply. 

4.2.1 .  Increasing  the  efficiency  of  the  search 

logo's  search  depth  has  been  increased  by:  (1)  upgrading  die  efficiency  of  the  evaluation  function;  (2) 
performing  all  computations  as  close  to  the  root  node  as  possible  ( incremental  updating  of  the  evaluation 
function):  and  (3)  decreasing  the  number  of  nodes  that  the  search  must  examine.  The  techniques  included  in 
(1)  arc  discussed  along  with  the  evaluation  function  itself  in  Section  4.4. 

Because  each  additional  ply  in  the  search  adds  a  loop  nested  within  the  current  innermost  loop  of  the 
computation,  technique  (2)  is  equivalent  to  the  optimization  technique  of  moving  code  out  of  loops.  Many  of 
the  values  that  arc  needed  in  the  evaluation  function  can  be  computed  by  initializing  them  at  the  beginning  of 
the  search,  and  updating  diem  as  the  search  proceeds.  This  procedure  takes  computations  that  would 
otherwise  be  done  completely  at  the  terminal  nodes  of  the  search  (inside  the  innermost  loop),  and  spreads 
them  out  over  the  entire  search  tree  (moving  much  of  the  effort  out  of  the  inner  loops). 


4  IALiO  ?l 

I  he  til i rd  technique  is  where  the  it-fi  algoiithm  is  so  important.  With  a  30  minute  time  allocution  in  which 
to  make  its  mines  (tournament  situation,  but  with  no  overhead  charges).  lago examines  an  average  of  10000 
terminal  nodes  per  move10  (counting  the  terminal  nodes  from  all  iterations  of  the  search).  The  average  raw 
branching  factor  is  9.9.  so  with  a  simple  minimax  search  (no  iterative  deepening),  lago could  only  achieve  a 
search  depth  of  4.0.  lago  achieves  its  average  search  depth  of  6.3  by  lowering  the  branching  factor  to  4.0.  The 
minimal  attainable  branching  factor  for  a-p  is  3.7  (sec  Slagle  &  Dixon  (1969)  for  die  defining  lormula). 

Since  the  order  in  which  nodes  arc  searched  is  crucial  to  the  efficiency  of  the  a- ft  algorithm,  lago  devotes 
considerable  resources  to  improving  on  random  ordering,  lago  gathers  information  about  node  ordering 
during  iterative  deepening.  The  top  Uircc  ply  of  the  search  tree  are  ordered  and  saved  once  dicy  have  been 
searched  --  guaranteeing  perfect  ordering  (according  to  die  information  obtained  so  far)  for  die  critical  top 
portion  of  the  tree,  below  the  top  three  ply,  lago  maintains  a  response  killer  table  (similar  to  the  refutation 
data  used  by  Cichelli  (1973a)).  Tor  each  possible  move  (combination  of  square  and  player),  the  table  contains 
a  list  of  all  possible  responses,  and  a  rating  for  each  response.  The  ratings  are  computed  incrementally  over 
iterations  of  the  „carch  (and  across  successive  moves  in  the  game,  though  dicy  are  halved  between  moves  to 
emphasize  the  effect  of  killers  found  on  the  current  move)  by  boosting  the  response  value  by  a  small  amount 
whenever  it  is  a  legal  response,  and  by  a  large  amount  whenever  it  is  the  best  response  (or  the  cause  of  a  cutoff 
in  the  search).  Ihcrcforc,  lago  can  generate  an  ordering  at  each  node  by  looking  at  the  ratings  of  die 
responses  to  die  move  made  at  that  node. 

A  two-factor  experiment  (depth  to  which  the  search  tree  was  saved  versus  the  ordering  heuristic  used  for 
the  remainder  of  the  search  tree)  was  performed  to  evaluate  the  effectiveness  of  these  techniques  (see  Gillogly 
(1978)  for  an  analysis  of  the  effectiveness  of  search  heuristics  in  chess).  Three  different  depths  were  used:  0,  1, 
and  3.  In  the  remainder  of  the  tree,  four  ordering  heuristics  were  tried:  (1)  a  fixed  ordering  of  the  nodes 
based  on  the  "natural"  order  of  move  generation  (lefl-to-right  across  the  board  and  top-to-bottom);  (2)  a  fixed 
ordering  according  to  the  "value"  of  the  squares  (e.g.  corner  squares  are  first);  (3)  a  killer  table  based  on  who 
is  to  move,  and  the  move  number  (similar  to  the  killer  heuristic  first  suggested  by  McCarthy  in 
1957  [McCarthy  81]);  and  (4)  the  response  killer  table. 

The  twelve  conditions  (3  x  4)  were  all  evaluated  on  the  same  set  of  six  data  points  selected  from  five  games 
at  the  most  recent  world  championship  tournament  [Prentice  81].  Two  points  were  selected  randomly  from 
moves  1-15  of  the  five  games,  two  from  moves  16-30,  and  two  from  moves  31-45.  Tor  each  point,  the  search 
depth  was  set  so  that  the  standard  condition  (tree  depth  of  3,  response  killers)  would  use  about  1  minute,  lTiis 

'°This  value  (as  well  as  the  value  for  the  branching  factor)  was  computed  by  averaging  the  values  over  the  first  46  moves  of  a  game 
played  by  lago  against  itself  (end-game  searches  were  done  from  move  47  until  the  end  of  the  game  (see  Section  4.3)). 
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depth  Mat  llicil  used  lot  all  twelve  conditions.  I  he  mean  (.md  st.mdaid  deviation)  wl  llie  blanching  taclwi  Iwr 
each  condition  (averaged  over  the  6  pvmits)  can  he  found  in  Cable  4  1.  I  lie  minimal  wbiaiiuhlc  brandling 
factor  for  the  six  points  is  4.9  (standard  deviation  ofO.S).  I  Ire  best  condition  (search  tree  saved  to  a  depth  of  3. 
with  the  response  killer  table  used  in  llie  remainder  of  the  tree),  is  the  one  used  in  lago.  I  his  still  leaves  room 
for  improvement,  as  the  branching  factor  for  this  condition  (4.1)  differs  r  infic  mllv  from  the  optimal  value  (p 

<  0.05). 


Depth 

Saved 

Ordering  Heuristic 

Fixed  Order  Killer  l  ablc 

Move  Generation  Square  Value  Move  Number  Response 

Marginal 

0 

5.6  (1.3) 

5.4  (1.8) 

HER  USUI 

4.2  (0.8) 

49(1.3) 

1 

5.5  0.2) 

5.10.6) 

wiiH 

4.8  (1.2) 

3 

4.7  (0.9) 

4.4  (1.0) 

1 

4.1  (0.8) 

4.3  (0.8) 

Marginal 

5.2  (l.l) 

4.9  (1.5) 

4.4  (0.7) 

4.2  (0.8) 

4.7  (1.1) 

Table  4-1:  Kvaluation  of  a- ft  ordering  heuristics  in  lago. 

Three  depths  to  which  the  search  tree  is  saved  x  four  ordering  heuristics 
Mean  (and  standard  deviation)  of  the  branching  factor  for  each  condition. 

Looking  at  the  marginal  means,  Uicre  is  a  dear  trend  along  both  dimensions:  the  more  ordering 
information  available,  the  lower  the  branching  factor.  Along  one  dimension,  as  the  depth  to  which  die  search 
tree  is  saved  increases,  the  branching  factor  decreases.  Along  the  other  dimension,  both  fixed  ordering 
schemes  maintain  a  single  ordered  list  of  squares  (the  square-value  list  has  effectively  more  information 
because  the  ordering  is  based  on  knowledge  of  die  game),  while  the  move-number  killer  table  requires  up  to 
15  (the  maximum  search  depth)  lists,  and  the  response  killer  table  requires  up  to  60  (the  number  of  squares  in 
which  it  is  possible  to  play)  lists. 

While  the  trends  arc  clear,  not  all  of  the  results  achieve  statistical  significance11.  There  is  a  significant 
difference  between  tree  depths  of  1  and  3  (p  <  0.001),  but  the  difference  between  0  and  1  did  not  achieve 
significance  (though  p  <  0.1  docs  hold).  Along  the  other  dimension,  the  two  fixed  ordering  schemes  do  not 
significantly  differ  ( p  <  0.15),  and  the  same  holds  for  the  two  killer  tables  (p  <  0.1).  When  die  fixed  ordering 
data  is  pooled  and  compared  against  die  pooled  killer  data,  the  difference  is  significant  ( p  <  0.005). 

1  ’the  data  were  normalized  before  they  were  checked  for  significance,  by  dividing  each  value  by  the  branching  factor  for  llie  control 
condition  (none  of  the  search  tree  was  saved,  and  (he  move-general  ion  fixed  ordering  was  used). 
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4.3.  Solving  end-game  positions 

lago  is  capable  of  solv  ing  (through  a  complete  search  to  the  end  of  die  game)  a  significant  number  of  end 
game  positions.  In  this  context,  solving  a  position  can  have  two  distinct  meanings,  corresponding  to  the  two 
top-level  goals  in  Othello  (Section  3.2):  ( 1 )  to  find  the  move  w  hich  maximizes  the  final  disc  differential;  or  (2) 
to  find  any  move  which  yields  the  best  possible  value  from  the  set  { win.  lie.  /ovvj.  la‘;o  lus  a  distint  t  (from 
each  other,  and  from  the  evaluation  function  used  in  die  earlier  part  of  the  game)  evaluation  function  for  each 
of  these  goals,  consisting  of  the  obvious  measure  for  each  interpretation:  ( 1)  the  final  disc  differential;  and  (2) 
a  three  valued  measure  (from  the  set  { win,  tie,  loss}). 

(-'valuation  (1)  is  logically  sufficient  for  both  top-level  goals,  but  evaluation  (2)  can  be  employed  earlier  in 
the  game,  lago  is  capable  of  performing  a  complete  end-game  search  with  evaluation  (2)  at  around  move  46 
(15  moves  from  the  end  of  the  game).  Employing  evaluation  (1)  is  usually  feasible  at  move  48  (13  Moves 
from  the  end).  Two  aspects  of  K valuation  (2)  arc  responsible  for  its  ability  to  be  used  prior  to  Evaluation  ( 1). 
Because  evaluation  (2)  can  return  only  one  possible  value  for  a  winning  position,  cutoffs  abound,  once  the 
first  winning  path  has  been  found  in  the  search  tree.  In  addition,  the  search  order  when  evaluation  (2)  is 
employed  should  be  nearer  to  optimal,  as  it  is  only  necessary  to  assure  that  the  winning  moves  precede  the  tics 
and  losses.  Extra  search  is  not  required  just  because  die  best  winning  move  was  not  searched  first. 

lago  decides  dynamically  which  evaluation  to  use.  At  every  iteration  in  die  search,  a  decision  is  made 
whether  to  go  one  deeper  with  the  normal  evaluation,  use  evaluation  (1).  use  evaluation  (2).  or  terminate  the 
search.  The  decision  as  to  which  search  to  perform,  is  based  on  empirically  determined  equivalences  in  search 
time.  Using  eidicr  evaluation  ( 1 )  or  (2)  the  search  can  get  deeper  than  is  possible  with  the  normal  evaluation 
because  of  the  smaller  time  costs  of  the  evaluation  functions  involved.  If  the  search  depth  for  the  current 
iteration  is  n,  die  next  iteration  will  be  cither  a  search  with  the  normal  evaluation  to  a  depth  of  n  + 1 ;  a  search 
with  evaluation  (1)  to  a  depth  of  n+ 5;  or  one  with  evaluation  (2)  to  a  depth  of  n  +  7.  The  search  times  for 
these  alternatives  arc  approximately  equal.  ITie  decision  algorithm  is: 

If  there  is  not  enough  time  for  another  iteration 
Then  terminate  the  search 

The  If  the  current  position  is  within  n  +  5  moves  of  the  end  of  the  game 
Then  use  evaluation  (1) 

Else  If  the  current  position  is  within  n  +  7  moves  of  the  end  of  the  game 

And  there  is  not  enough  time  to  do  two  more  iterations  and  evaluation  (1) 

Then  use  evaluation  (2) 

Else  complete  another  iteration  with  the  normal  evaluation. 
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4.4.  lago’s  evaluation  function 

When  lago  is  unable  to  search  to  the  end  of  the  game,  it  makes  use  of  a  single  evaluation  function 

consisting  of  four  components  based  on  the  analysis  of  Othello  strategy  in  Section  3.  The  components  arc 

weighted  by  application  coefficients  [Berliner  80).  and  then  summed  to  yield  a  single  value  for  the  evaluation. 

'Hie  coefficients  used  in  lago  were  based  initially  on  opinions  about  the  i  cl  dive  importance  of  the  individual 

components.  A  limited  set  of  variations  on  these  values  was  then  evaluated  by  playing  these  differing  versions 

of  lago  against  each  other.  The  best  variation  became  logo's  evaluation  function12: 

Fval(pos)  =  YS\C(MoveNumber)x  EdgeStabilily  4-  26x  Internals  lability  + 

CM.\C(  MoveNumber)xCurrent  Mobility  +  99x  Potent  ialMobility 

Two  of  the  application  coefficients  vary  with  move  number  to  reflect  the  relative  importance  of  those 

components  during  different  stages  of  the  game. 

KSAC(  Move  Number)  =  312  +  6.24X  Move  Number  1  <  MoveNumber  <  60 

CM  \C( MoveNumber)  =  50  +  IxMoveNumber  1  <  MoveNumber  <  25 

=  75  +  MoveNumber  25  <  MoveNumber  <  60 

Notice  that  the  edge-stability  application  coefficient  is  almost  an  order  of  magnitude  greater  than  any  of  the 

others.  This  insures  that  nontrivial  edge-stability  values  will  dominate  the  values  from  the  other  components, 

moving  lago  automatically  into  the  end-game. 

The  value  for  logo's  evaluation  function  (on  its  moves)  in  the  game  at  the  Santa  Cruz  Open  vs.  Reversi 
Challenger  s  shown  in  Figure  4-2.  These  values  were  determined  by  a  post-hoc  analysis  of  the  game  record11. 
The  graph  terminates  (after  move  46)  when  lago  is  able  to  exactly  solve  an  end-game  position  (Section  4.3). 

The  remainder  of  this  section  is  devoted  to  descriptions  of  the  four  components  of  the  evaluation  and  a 
brief  discussion  of  the  opening  game,  a  weakness  in  logo. 

4.4.1.  Stability 

logo's  evaluation  function  has  two  stability  components:  (1)  EdgeSlability  --  the  corner  and  edge  squares, 
and  their  interactions  with  x-squares;  and  (2)  Internals  lability  -  the  stability  of  non-edge  squares. 


12To  facilitate  comparisons  in  this  presentation,  the  component  ranges  have  all  been  normalized  to  [-1000,  1000],  with  the  differences 
moved  into  the  application  coefficients. 

H/tgo  is  set  up  to  compute  statistics  for  each  position  that  occurred  in  a  game.  Tournament  time  allocations  arc  set  (except  that  the  10 
seconds  per  move  for  overhead  is  not  charged),  and  a  search  is  performed  at  each  position.  During  the  search,  /ago  saves  information 
such  as  the  value  returned  by  the  search,  the  move  that  lago  would  have  made,  and  the  values  of  the  components  of  the  evaluation. 
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Figure  4-2:  logo's  evaluation  function  on  its  moves 
Reversi  Challenger  ( B-3)  vs.  logo  (W-60) 
Santa  Cruz  Open  Machine  Othello  Tournament 


4.4. 1.1.  Edge  stability 

For  computations  of  edge  stability,  logo  assumes  that  each  side  of  the  board  can  be  treated  independently. 
'Hie  6561  (38)  possible  configurations  of  eight  edge  squares  (including  the  two  adjoining  corners),  are  handled 
by  a  precomputed  table.  Each  element  in  the  table  reflects  the  value  of  one  edge  configuration  for  black, 
assuming  that  he  has  the  next  move14.  Values  for  the  three  other  possibilities  (the  value  for  black  with  white 
to  move,  the  value  for  white  with  white  to  move,  and  the  value  for  white  with  black  to  move)  arc  all  retrieved 
from  this  table.  The  value  for  white  is  the  negative  of  the  value  for  black,  and  the  values  for  when  white  has 
the  next  move  are  determined  from  the  inverse  configuration  (all  white  discs  changed  to  black,  and  all  black 
to  white)  with  black  to  move. 

The  table  is  precomputed  by  an  iterative  algorithm.  The  first  step  is  to  initialize  the  table  with  a  set  of  static 
values.  Each  disc  in  a  configuration  has  a  weight  that  depends  on  the  square  it  is  in  (such  as  corner  or  c- 
square),  and  the  stability  of  the  disc  --  either  stable,  semi-stable  (cannot  be  flipped  on  the  next  move),  or 


14 

This  assumption  is  important  because  of  the  asymmetry  inherent  in  many  side  configurations.  Recall  Figure  3-3b:  because  white  has 
the  next  move,  he  is  able  to  move  to  cS  followed  by  hS,  stabilizing  seven  of  the  squares  along  the  bottom  edge.  If  black  had  had  the  next 
move,  he  could  have  turned  it  around  by  moving  to  c8,  stabilizing  seven  of  the  bottom  edge  squares  for  himself. 
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unstable  (Tabic  4-2).  I  lie  static  value  of  die  configuration  is  the  sum  of  these  weights  (positive  lor  black  discs 
and  negative  for  white  discs). 


Stable 

Semi-stable 

Unstable 

Corner 

700 

* 

* 

C-Squarc 

1200 

200 

-25 

A-Squarc 

1000 

200 

75 

B-Squarc 

1000 

200 

50 

*  Impossible  case  since  corners  must  be  stable 


Table  4-2:  The  weights  for  edge  discs. 

Each  iteration  of  die  algorithm  starts  w  ith  the  completely  filled  configurations  (all  discs  stable)  and  works 
backwards  to  intermediate  positions  by  removing  discs  from  these  positions.  These  intermediate  positions  arc 
evaluated  through  the  computation  of  the  expected  value  of  the  position,  l-or  each  empty  square,  the  value 
for  when  black  plays  there  (computed  by  playing  a  black  disc  and  retrieving  die  value  for  dial  position  from 
the  table)  is  multiplied  by  the  probability  of  being  able  to  make  that  move.  The  probability  is  1  if  black  can 
move  there  by  (lipping  an  edge  disc,  otherwise  it  is  0  for  corner  squares  and  a  function  of  the  neighboring 
discs  for  the  other  six  squares,  black  is  also  allowed  to  avoid  making  a  move  on  die  edge  (leaving  the  edge 
configuration  as  it  is)  with  a  probability  of  1.  The  best  alternative  is  chosen  as  the  value  of  the  configuration. 
As  values  arc  assigned  to  intermediate  positions,  new  positions  are  generated  bv  removing  discs  from  the  ones 
already  evaluated.  Ihc  iteration  is  complete  when  a  value  has  hecn  assigned  to  the  empty  configuration.  The 
table  used  in  lago  was  the  result  of  five  iterations. 

The  current  table  is  adequate,  but  overly  greedy.  An  attempt  to  rectify  this  problem  by  using  a  more 
sophisticated  generation  algorithm,  led  to  a  table  of  values  that  was  less  greedy,  and  played  a  better  edge 
game.  Unfortunately,  because  of  a  problem  to  be  discussed  in  Section  4  4.2.2.  the  non-greediness  of  this  new 
table  caused  more  problems  than  it  solved. 

A  more  serious  problem  occurs  when  the  assumption  of  independence  breaks  down,  primarily  at  corners. 
One  such  problem  stems  from  the  table’s  ignorance  of  the  effect  of  x-squarc  discs  on  corner  squares  (and  by 
extension,  to  the  adjacent  edges).  Through  most  of  the  game,  playing  a  disc  in  an  x-squarc  practically  assures 
the  opponent  access  to  the  adjacent  corner.  Plough  this  can  greatly  afTect  the  edge  values,  this  effect  is  not 
handled  by  the  edge  table.  If  corners  were  always  good  and  x-squares  always  bad.  x-squarcs  could  simply  be 
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avoided.  There  arc.  however,  major  exceptions  to  this  rule.  As  the  situation  in  I  igure  3-y  shows,  black's  move 
to  g2  (an  x-square),  at  move  45.  is  an  excellent  move.  Following  white’s  comer  move  at  hi,  black  is  ahlc  to 
stabilize  three  sides  of  the  board.  Under  such  situations  it  is  often  advisable  to  make  the  x-squarc  move.  If 
the  opponent  is  low  in  mobility,  he  will  be  forced  to  take  the  corner.  At  worst,  it  gives  the  player  one  more 
safe  move,  which  can  be  critical  late  in  the  game. 

/ago  overcomes  this  weakness  in  the  edge  table  by  explicitly  evaluating  the  interactions  between  x-squarcs 
and  corners  (and  edges).  It  compares  the  expected  cost  of  making  an  x-squarc  move  with  the  value  of  not 
making  it.  The  expected  cost  is  computed  by  estimating  the  probability  of  the  x-squarc  being  used  as  an 
access  route  to  the  corner  square,  and  the  cost  of  giving  up  the  corner  square.  The  estimation  of  the 
probability  (/>)  breaks  down  into  three  cases. 

•  If  the  adjacent  corner  is  occupied,  the  probability  is  0. 

•  If  the  opponent  can  immediately  flip  into  the  comer  by  way  of  the  x-squarc,  the  probability  is 
assumed  to  be  l15. 

•  Otherwise,  /ago  assumes  the  probability  to  be  a  decreasing  function  of  move  number 
(1  -  MovcNutnber/\20).  As  the  move  number  increases,  there  arc  fewer  squares  in  which  it  is 
possible  to  play,  increasing  the  difficulty  of  utilizing  the  x-square. 

The  cost  of  giving  up  the  corner  square  (assuming  it  is  empty)  is  computed  by  performing  a  small  search  at 
those  terminal  nodes  of  the  regular  search  in  which  an  x-squarc  is  occupied,  while  the  adjacent  comer  is 
empty.  A  disc  of  opposite  color  from  the  x-squarc  disc,  is  played  in  the  corner  square  (even  if  it  is  not  a  legal 
move),  and  the  appropriate  edge  discs  arc  flipped16  along  the  two  adjacent  edges.  Ihe  values  for  these  two 
edges  arc  combined  into  a  single  value  for  the  corner.  It  is  assumed  that  the  player  w  ith  the  next  move  will  be 
able  to  take  his  choice  of  which  side  to  play  into;  so  the  value  for  that  side  (with  that  player  to  move  next)  is 
added  to  the  value  of  the  adjacent  side  (with  the  other  player  able  to  move  first).  This  composite  value  is 
compared  to  the  composite  value  for  the  existing  edge  configurations;  if  the  new  value  (for  the  player  who 
made  the  x-squarc  move)  is  less  than  the  current  value,  the  expected  value  (px  New  Value  +  (1- 
p)xF.xistingValue)  is  used  instead  of  the  existing  one17. 


16 Due  to  an  oversight,  this  condmon  was  not  actually  tested 

16The  discs  arc  not  actually  flipped  Instead,  an  index  (based  on  the  color  of  the  disc  to  be  played  in  the  corner  and  the  current  edge 
configuration)  is  kept  to  the  new  value  in  the  edge  table 

17There  is  an  additional  multiplicative  factor  tn  the  expected  value  that  corrects  for  the  fact  that  the  effect  of  a  corner  disc  on  an  edge 
value  is  counted  twice  when  the  disc  is  actually  on  the  board  (each  edge  affects  two  of  the  four  corner  values),  but  only  once  when  it  it 
placed  there  during  the  x  square  evaluation 


Weighted  Value 
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The  four  corner  values  arc  summed  to  yield  a  single  value  for  die  edge  of  die  board.  figure  4-3  depicts  the 
trajectory  of  the  edge  value  for  logo's  game  in  the  Santa  Cm/  Open  against  Reversi  Challenger,  fllic  value 
takes  off  after  move  30.  because  logo  perceives  that  it  will  gain  a  corner  (figure  4-4)).  These  are  the  weighted 
values  determined  from  logo's  search. 


Figure  4-3:  logo's  weighted  edge  values 
Reversi  Challenger (B-3)  vs.  logo  (W-60) 
Santa  Cruz  Open  Machine  Othello  Tournament 
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Figure  4-4:  After  move  32  --  All  of  black’s  moves  give  up  a  corner 
Reversi  Challenger  (ft-})  vs.  logo  (W-60) 

Santa  Cruz  Open  Machine  Othello  Tournament 
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lTjis  procedure  docs  a  more  than  adequate  job  of  evaluating  interactions  between  x-squares  and  edges,  for 
example,  in  the  game  at  the  Santa  Cruz  Open  between  logo  and  The  Moor  logo  made  three  x-squarc  moves  - 
-  at  moves  29,  34.  and  38.  Each  x-squarc  move  sacrificed  a  corner  square,  but  left  logo  in  a  position  in  which 
it  could  win  handily  51  to  13.  However,  there  are  other  interactions  between  adjacent  edges  that  arc  not 
handled  by  either  the  edge  table,  or  the  x-square  algorithm.  Figure  4-5  shows  a  contrived  situation  in  which 
independence  fails  catastrophically. 
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Figure  4-5:  Contrived  example  of  edge  interactions.  White  to  move. 

The  edge  table  assigns  a  high  value  to  this  position.  IT»c  reason  is  simple;  the  evaluation  of  the  left  edge 
shows  that  white  is  guaranteed  access  to  the  comer  at  al,  and  none  of  the  other  sides  yield  a  certainty  of  black 
gaining  a  corner.  However,  a  little  search  reveals  that  black  will  almost  assuredly  gain  the  remaining  three 
sides  of  the  board  if  white  takes  the  corner.  The  result  is  that  as  long  as  white  avoids  the  corner,  the  edge 
values  are  strongly  in  white’s  favor  (though  they  should  be  fairly  even),  because  of  the  ability  to  take  the 
comer.  However,  should  white  actually  take  the  corner,  its  value  would  plummet  immediately.  Hie  source  of 
this  difficulty  is  that  the  edge  table  assumes  a  corner  is  always  desirable  as  long  as  taking  it  has  no  bad 
consequences  for  that  edge. 

logo  handles  these  situations  only  to  the  extent  that  such  sequences  can  be  found  by  the  normal  a-fi 
search.  'Hie  lack  of  a  mechanism  that  can  evaluate  these  positions  properly  is  one  of  the  major  remaining 
weaknesses  in  logo.  One  possible  future  resolution  to  this  problem  is  to  estimate  dynamically  the  desirability 
of  a  comer  by  a  small  search  like  the  one  used  for  x-squarcs,  and  add  information  to  the  edge  table  about  the 
values  of  edges  when  the  comers  are  undesirable. 
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4.4.1 .2.  Internal  stability 

Determining  which  internal  sciuarcs  arc  stable  is,  in  general,  a  difficult  problem.  However,  it  was  possible 
to  isolate  a  subclass  of  stable  internal  squares  for  which  a  feasible  detection  algorithm  could  be  devised,  lire 
algorithm  is  based  on  the  fact  that  an  internal  disc  is  stable  if  there  is  a  stable  disc  of  the  same  color  adjacent  to 
it  on  each  of  the  four  lines  through  the  square,  live  algorithm  makes  use  of  two  data  structures:  a  set  of 
unprocessed  stable  squares,  and  a  boolean  vector  which  tells  for  each  square  whether  there  is  a  stable  disc  on 
it.  The  algorithm  is: 

1.  Initialize  the  set  to  be  empty  and  all  elements  of  the  vector  to  be  false  (all  squares  unstable). 

2.  For  each  occupied  corner  square,  mark  it  as  stable,  and  add  that  square  to  the  set  for  future 
processing. 

3.  While  the  set  is  not  empty,  remove  an  element  and  process  it: 

For  each  square  adjacent  to  the  element,  if  it  is  stable  (that  is,  there  is 
a  stable  disc  of  the  same  color,  or  an  edge  of  the  board,  adjacent  to  it  along 
each  of  the  four  lines)  and  not  so  marked,  then  mark  it  stable  and  add  it  to 
the  set. 

4.  The  result  is  the  difference  of  the  number  of  stable  internal  disks  for  the  two  players  (calculated 
from  the  stability  vector). 

The  utility  of  this  algorithm  has  proven  to  be  marginal.  In  tight  games,  there  arc  generally  not  a  significant 
number  of  squares  with  this  type  of  internal  suability  until  it  is  already  possible  to  exactly  solv  e  the  position; 
so  this  algorithm  has  the  greatest  effect  in  games  in  which  I ago  already  has  an  advantage.  An  additional 
problem  with  this  algorithm  is  that  it  can  miss  a  large  number  of  stable  internal  discs.  Since  there  is  a 
correlation  between  edge  stability  and  the  stability  of  the  near-by  internal  discs,  working  on  edge  stability 
leads  to  the  stabilization  of  internal  discs.  Some  are  found  by  logo's  algorithm,  others  arc  not.  lhese  others 
can  be  troublesome,  as  they  need  not  belong  to  the  player  with  the  stable  edge  squares.  Figure  4-6  shows  a 
position  in  which  logo's  algorithm  finds  only  4  out  of  12  stable  internal  discs.  Two  of  diem  (c/7  and  c7)  belong 
to  black,  even  though  the  stable  edge  and  corner  discs  all  belong  to  white. 

The  cost  of  executing  this  algorithm  is  decreased  by  not  performing  it  during  the  evaluation  function.  By 
executing  it  one  ply  closer  to  the  root  node  of  the  search  tree1*,  the  cost  is  cut  by  a  factor  of  4  (the  branching 
factor).  Computing  an  expensive  component  at  the  terminal  nodes  of  the  search  tree  can  result  in  the  loss  of  a 
ply  or  more  in  the  search,  leading  to  poorer  information  about  all  components.  By  moving  the  computation 
of  the  more  expensive  ones  out  of  the  terminal  nodes,  performance  on  the  other  components  is  maintained  at 


ic  value  is  then  passed  to  all  of  the  terminal  nodes  that  arc  dependents  of  the  node  at  which  the  calculation  us  performed. 
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Figure  4-6:  logo's,  algorithm  determines  b7,  c7.J7 ,  and  g7  to  be  stable. 

In  addition,  bi.  c.l  M,  J5.  c6,  (16,  rf7,  and  c7  arc  stable. 

Reversi  Challenger (B-3)  vs.  lago(W-bQ) 

Santa  Cruz  Open  Machine  Othello  Tournament 

the  higher  level.  In  general,  components  should  be  placed  so  as  to  maximize  the  pcrformancc/cost  ratio  for 
the  evaluation  as  a  whole  (though  the  actual  placement  in  logo  is  somewhat  ad  hoc). 

4.4.2.  Mobility 

According  to  the  general  analysis  of  Othello  (Section  3.2).  mobility  is  the  primary  strategic  concern  during 
the  middle  game  (Section  3.2.2).  logo's  mobility  knowledge  is  divided  into  two  components:  (1) 
CurrentHfobi/ity  the  number  of  moves  currently  available;  and  (2)  Potent  ia!M ability  --  the  potential  of 
current  board  configurations  to  lead  to  future  mobility. 

4.4.2. 1 .  Current  Mobility 

To  evaluate  currently  available  moves,  logo  does  the  obvious,  it  counts  them.  Moves  are  always  counted 
for  a  player  just  prior  to  when  the  player  is  to  move.  Thus,  only  the  value  for  one  player  is  actually  computed 
in  the  evaluation  function.  The  other  player's  moves  arc  counted  just  before  his  last  move  in  the  search 
(usually  one  ply  closer  to  the  root  node).  It  is  assumed  that  each  player  has  a  50%  chance  of  taking  any  square 
that  both  players  have  access  to,  and  100%  chance  for  every  square  to  which  he  has  sole  access.  A  count  is 
generated  for  each  player  by  adding  twice  his  sole  access  squares  to  the  number  of  mutual  access  squares. 
These  two  counts  are  then  non-lincarly  combined  into  a  single  component  value.  If  the  value  for  the  player 
performing  the  evaluation  is  p,  and  the  value  for  the  opponent  is  o,  then  the  value  for  the  component  is: 

CurrcntMobilityfp,  o)  =  Truncate{1000  x  (p  -  o)/(p  +  o  +  2)]  (l) 

A  strict  difference  between  the  two  values  is  not  used  because  having  an  advantage  of  4  to  1  is  more  valuable 
than  an  advantage  of  15  to  11. 

Figure  4-7  traces  the  value  of  the  weighted  measure  over  the  game  between  logo  and  Reversi  Challenger  at 
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tlic  Sanui  Cru/  Open.  In  this  game,  the  transition  from  the  middle  to  the  end  game  ih.luis  between  moves  W) 
and  32,  when  the  advantage  in  legal  moves  translates  into  domination  of  the  edges. 


Figure  4-7;  lagos  weighted  legal  move  measure 
Reversi  Challenger (Wl)  vs  Iago( W-60) 

Santa  Cru/.  Open  Machine  Othello  Tournament 

Legal  move  computation  is  one  of  the  most  costly  portions  of  the  evaluation  function.  Averaged  over  the 
six  positions  used  in  the  evaluation  of  ordering  heuristics  (Section  4.2.1),  the  mean  cost  of  counting  die  legal 
moves  for  one  player  in  the  evaluation  function  is  3.4  msecs,  compared  with  .5  msecs  for  all  of  the  other 
calculations  (yielding  a  total  of  7.3  msecs  per  evaluation  when  legal  moves  are  counted  for  both  players).  By 
doing  the  computation  for  one  player  at  the  nodes  one  closer  to  the  root  node,  the  cost  is  split  over  4  (the 
branching  factor)  terminal  nodes,  yielding  a  net  cost  of  4.75  msec  per  evaluation.  To  minimize  the  cost  even 
further,  legal  moves  are  counted  last,  and  the  evaluation  is  terminated  prematurely  when  either:  (1)  the 
maximum  value  possible  for  this  measure  would  not  lead  to  an  evaluation  better  than  the  current  best  nixie; 
or  (2)  the  minimum  value  possible  for  this  measure  would  lead  to  a  cutoff  in  the  earth  tree.  ITicse  conditions 
are  checked  before  each  square  is  evaluated  for  a  possible  legal  move.  This  optimization  trims  another  0.8 
msecs  per  evaluation,  for  a  net  of  3.95  msecs. 

One  of  fago s  current  weaknesses  is  that  the  legal-move  measure  docs  not  distil  guish  between  acceptable 
and  unacceptable  moves  (Section  3.2.2).  Figure  4-8a  is  a  graphic  example  of  this  problem  from  the  game 
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between  lago  and  Aldaron  (written  by  Charles  Heath)  at  die  Santa  Cruz.  Open.  At  mine  42.  /ago  lias  eleven 
moves  available.  However,  seven  of  those  moves  result  in  the  immediate  loss  of  a  orner.  Because  lago  has 
plenty  of  mobility,  and  the  necessity  of  giving  up  a  corner  is  over  die  search  horizon,  its  evaluation  is  positive 
through  move  40  ( F'igurc  4-9)  .  As  soon  as  lago  realizes  diat  it  will  have  to  play  one  of  die  bad  moves,  its 
evaluation  plummets.  Though  lago  did  win  by  a  score  of  44  to  .10.  Aldaron  had  a  sure  14  to  10  victory  until  it 
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unaccountably  made  die  wrong  move  at  move  number  59  . 
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(a)  White  to  move  (move  42)  (h)  The  game  record 

7  of  1 1  moves  are  bad 


(c)  'Hie  final  position 


Figure  4-8:  Seven  of  logo's  eleven  moves  give  up  a  corner  immediately 
Aldaron  {WIG)  vs.  lago  (W-34) 

Santa  Cruz  Open  Machine  Othello  Tournament 

The  desirability  of  possible  moves  can  of  course  be  computed  by  searching  the  game  tree  below  each  move. 
However,  because  this  must  be  done  at  the  terminal  nodes  of  the  regular  search,  it  is  prohibitively  expensive. 
Another  alternative  is  to  divide  the  squares  up  into  a  priori  acceptable  and  unacceptable  groups,  as  is  done  by 
Aldaron[Ccti  81b].  A  third  alternative,  and  the  one  that  will  probably  be  tried  in  lago,  is  to  keep  die  current 
measure,  and  to  add  a  second  measure  in  which  moves  are  eliminated  if  they  lead  to  an  immediate 
surrendering  of  a  corner  square  (assuming  that  the  corner  was  not  already  available  by  some  other  path). 

4. 4. 2. 2.  Potential  mobility 

I ago  computes  three  measures  that  combine  to  assess  two  of  the  three  aspects  of  potential  moves  (Section 
3.2.2):  (1)  a  string  of  the  opponents  discs  with  (2)  an  empty  square  at  one  end.  The  three  measures  are: 

•  The  number  of  discs  that  the  opponent  has  adjacent  to  empty  squares  ( frontier  discs). 

•  The  number  of  empty  squares  adjacent  to  the  opponent’s  discs. 
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Thus  in  some  sense  lago  won  the  tournament  by  a  fluke,  since  this  game  was  Aldaron  s  only  loss.  1  lowcvcr,  in  an  unofficial  rematch, 
lago  beat  Aldaron  40  to  24  (Figure  4-10). 
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Santa  Cruz  Open  Machine  Othello  Tournament 
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(a)  The  game  record 
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(b)Thc  final  position 


Figure  4- 1 0:  logo  ( B-40)  vs.  Aldaron  (W-24) 

Unofficial  rematch 

Santa  Cruz  Open  Machine  Othello  Tournament 

•  The  sum  of  the  number  of  empty  squares  adjacent  to  each  of  the  opponent's  discs  (while  this  is 
like  the  previous  measure,  empty  squares  that  arc  adjacent  to  more  titan  one  of  his  discs  will  be 
counted  multiply  ••  once  for  each  disc). 


Individually,  these  measures  cannot  capture  all  of  the  important  aspects,  but  collectively  they  capture  much  of 
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it.  'ITic  first  measure  establishes  one  type  of  bound  on  die  number  of  potential  moves  by  counting  die  discs 
that  can  be  used  as  a  terminus  of  a  string  of  discs  to  be  flipped.  The  second  measure  establishes  the 
complementary  bound  by  counting  the  number  of  squares  onto  which  a  player  can  move.  Ihc  Uiird  measure 
essentially  repeats  the  second,  with  each  blank  being  weighted  by  the  number  of  discs  to  which  it  is  adjacent. 
The  more  possible  avenues  there  are  to  an  empty  square,  the  more  likely  it  is  that  at  least  one  will  be 
converted  into  a  legal  move. 

Each  measure  generates  two  values  (one  for  each  player)  which  arc  combined  into  a  single  value  for  each 
measure  by  Equation  1  (Section  4.4.2.1).  'Hie  three  values  (one  for  each  measure)  are  then  summed  to  yield  a 
single  value  for  this  component.  Figure  4-11  shows  each  measure  individually  weighted,  and  the  combined 
weighted  value  for  the  game  at  the  Santa  Cruz  Open  between  I  ago  and  Reversi  Challenger. 


Figure  4-11:  logo's  potential  mobility  measures  (for  each  of  its  moves) 
Reversi  Challenger  (HI)  vs.  Iago(  W-60) 

Santa  Cruz  Open  Machine  Othello  Tournament 


There  is  one  aspect  of  the  mobility  potential  of  a  position  that  is  not  captured  by  these  measures:  (3)  the 
player  must  have  a  disc  at  the  other  end  of  the  string  of  discs  to  be  flipped.  logo  currently  has  no  mechanisms 
that  directly  address  this  condition.  Because  of  the  greediness  of  the  current  edge  table  (Section  4.4.1. 1),  lago 
can  frequently  fulfill  this  condition  by  using  an  edge  disc.  The  alternative  non-greedy  edge  tabic  had  trouble 
for  precisely  this  reason;  it  lacked  end  discs  that  could  lead  to  legal  moves.  One  possible  solution  is  to  define 
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a  sci  of  measures  based  on  adjacency  bclwccn  discs  of  opposite  colors;  tor  example,  die  number  of  discs  die 
opponent  has  adjacent  to  the  player’s  discs.  This  idea  has  not  been  tested  in  lago. 

4.4.3.  The  opening  game 

In  the  current  version  of  logo,  the  opening  game  is  played  like  the  middle  game,  with  the  exception  of 
changes  in  die  application  coefficients,  lago  does  not  have  an  opening  book,  and  it  is  difficult  to  tell  how 
much  it  is  hurt  by  the  lack  of  one.  There  arc  games,  such  as  the  one  at  the  Santa  Cruz  Open  between  lago  and 
AUaron  (Figure  4-8).  in  which  lago  fell  behind  in  the  opening  game  (Figure  4-9),  and  never  quite  recovered. 
When  the  colors  were  reversed  in  the  unofficial  rematch  (Figure  4-10),  forcing  lago  to  play  a  different 
opening,  it  won  by  a  40  to  24  score.  It  is  unclear  whether  to  attribute  this  turnaround  to  the  color,  or  to  the 
opening.  During  the  ten  games  played  by  lago  at  the  Santa  Cruz  Open  tournament  (including  the  unofficial 
rematch  against  AUaron  and  one  against  Reversi  Challenger  that  was  won  by  lago  37  to  27),  lago  played 
white  6  times  for  an  average  score  of  41.33  to  22.33,  and  black  4  times  for  an  average  score  of  48.25  to  15.75. 
When  the  games  in  which  lago  played  white  are  broken  up  according  to  the  opening  move,  the  results  are: 
the  diagonal  opening  was  played  2  times  for  an  average  score  of  37  to  26.5;  the  perpendicular  opening  was 
played  4  times  for  an  average  score  of  43.5  to  20.25;  and  the  parallel  opening  was  not  played. 

The  most  promising  approach  to  generating  an  opening  book  for  lago  appears  to  be  to  use  lago  itself  to 
precompute  choices  for  opening  lines  of  play20. 
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We  understand  that  Bettt  uses  this  technique  to  extend  its  chess  opening  book 
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5.  Comparison  of  lago  with  Human  Play 

The  current  version  of  lago  has  not  had  the  opportunity  to  play  against  championship  level  human 
competition.  An  attempt  to  arrange  a  match  between  lago  and  Jonathan  Ccrf.  the  current  world  Othello 
champion,  was  gracefully  rebuffed  (Ccrf  81b]: 

I  understand  Paul  Rosenbloom  is  interested  in  arranein;;  a  match  against  me.  llnforiun.il  !;,  my 
schedule  is  very  full,  and  I'm  going  to  see  to  u  that  it  remains  dial  way  lor  die  foreseeable  luturc. 

As  an  alternative  to  direct  play,  lago s  performance  can  be  compared  with  that  of  expert  players  by 
examining  logo's  analyses  (at  tournament  lime  allocations,  except  no  tournament  overhead  expenses  arc 
charged)  of  expert  games,  lago  processes  each  game,  a  move  at  a  time.  For  each  position  in  the  game,  lago 
first  determines  the  move  it  would  make:  computing  and  saving  the  value  for  that  movc.and  the  individual 
components  of  die  evaluation.  A  value  is  dten  computed  for  die  move  actually  played  in  the  game,  by  playing 
die  move,  and  then  searching  to  a  depth  one  less  than  the  depth  used  to  search  for  lago  s  move. 

This  information  forms  die  basis  for  three  types  of  comparisons  between  lago  and  the  expert  players:  (1) 
the  frequency  with  which  logo's  moves  arc  equivalent  to  those  of  the  experts' (2)  the  evaluation  for  logo's 
choices  and  the  experts'  choices;  and  (3)  a  qualitative  examination  of  whether  the  components  of  logo's 
evaluation  correlate  with  winning.  These  comparisons  have  been  made  for  the  two-game  1980  world 
championship  match  between  Jonathan  Ccrf  (U.S.A.)  and  Takuya  Mimura  (Japan).  The  first  game  has 
already  been  presented  in  Figure  3-922.  The  second  game  in  the  match  can  be  seen  in  Figure  5-123.  Both 
games  were  won  by  Ccrf.  Figures  5-2  and  5-3  show  the  moves  that  lago  would  have  made  at  each  point  in  the 
two  games24. 

Table  5-1  summarizes  the  first  two  comparisons.  Moves  46  to  60  arc  listed  separately  because  lago  is  the 
expert  in  that  region  -  it  solved  those  positions  during  the  analysis.  Moves  39  and  41  by  Ccrf  in  the  first  game 
were  eliminated  from  the  analysis  because  lago  radically  misestimates  their  values.  ITic  board  position  just 
before  move  39-/i2can  be  seen  in  Figure  5-4a.  Black's  play  at  h2,  allows  white  to  gain  the  comer  at  hi  (by  first 
playing  at  h5).  As  long  as  white  can  avoid  playing  at  hi,  the  right  side  is  in  his  favor,  and  the  top  side  is 
acceptable.  Once  white  plays  at  hi  though,  the  right  side  is  still  in  his  favor,  but  the  top  side  is  highly 

21Two  moves  arc  equivalent  but  not  identical  only  when  the  choice  is  between  symmetric  moves  (such  as  the  first  move  of  the  game), 
or  the  two  moves  would  result  in  the  same  final  score  (applies  in  those  end-game  positrons  that  arc  solvable  by  logo) 

22Th  is  game  has  been  analyzed  by  Ccrf  (1981a). 

23This  game  has  been  analyzed  by  Sullivan  (1981). 

24Bccausc  lago  docs  not  always  make  the  same  choice  as  was  actually  played  in  the  game,  it  may  suggest  the  same  move  at  more  than 
one  move  number,  leading  to  repetitions  in  these  records 
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Figure  5-1:  Second  game  of  the  1980  World  Championship  match 
Mimura  (13-21)  vs.  Ccrf  (W-43) 
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Figure  5-2:  The  moves  that  lago  selects  when  faced  with  the  game  positions 
First  game  of  the  1980  World  Championship  match 
Ccrf  (11-44)  vs.  Mimura  (W-20) 
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Figure  5-3:  The  moves  that  lago  selects  when  faced  w  ith  the  game  positions 
Second  game  of  the  1980  World  Championship  match 
Mimura  (11-21)  vs.  Ccrf  (W-43) 
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negative.  While  dues  eventually  lui\e  lu  pla>  al  hi,  bill  ll  is  beyond  die  luni/on  u!  land s  search.  Similarly,  al 
mine  41  (Figure  5-4b),  black's  play  at  </.?  allows  white  to  push  the  h!  move  oxer  the  search  horizon.  This  is 
exactly  the  son  of  problem  that  can  be  caused  by  the  independence  assumption  built  into  the  table  of  side 
values. 


Game 

Number 

Player 

Score 

Fquivalcnt  Move  Fraction 

1-45  46  -  60  1-60 

Mean  Difference 

Fv. dilation  Disc  Count 

1-45  46-60 

1 

Ccrf* 

44 

.52 

.88 

.62 

4214 

0.7 

1 

Mimura 

20 

.55 

.86 

.62 

7189 

4.0 

2 

Mimura 

21 

.43 

.86 

.53 

4527 

1.4 

2 

Ccrf 

43 

.50 

.88 

.60 

2227 

0.5 

Subtotal 

Ccrf* 

87 

.51 

.88 

.61 

3197 

0.6 

Subtotal 

Mimura 

41 

.49 

.86 

.58 

5828 

2.7 

Total 

128 

.50 

.87 

.59 

4543 

1.6 

*  Ccrfs  moves  39  and  41  were  eliminated  from  game  l  pnor  to  this  analysis,  because  lagu  totally  misjudges  them  (see  text  for  discussion) 


Table  5-1:  /ago's  analysis  of  the  1980  world  championship  match  (Moves  2-45) 
Jonathan  Ccrf(U.S.A.)  vs.  Takuya  Mimura  (Japan) 


a  b  c  d  e  t  0  h 


■ 

m 

m 

m 

m 

H 

m 

■ 

SI 

■ 

m 

m 

N 

SI 

m 

□ 

to 

m 

m 

m 

n 

r ArAr  a 
LA  A  J 

r 'i 

r  a 

M1 

r ' 

r  a 

nn 

r  A 

X  J 

v  J 

X  J 

X  J 

vax  AJ 

\r  ^ 

1  w  A 

r  i 
l  J 

r  a 

X  J 

m 

[] 

m 

□ 

■ 

b 

m 

m 

m 

M 

m 

■ 

□  i 

to 

m 

m 

SI 

ss 

m 

■ 

to 

(b)  Before  black's  41 -aJ 


Figure  5-4:  Two  positions  in  which  lago  misestimates  the  value  of  a  move 
Cerf(B-44)vs.  Mimura  (W-20) 

First  game  of  the  1980  World  Championship  match 
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5.1 .  Equivalent  move  fraction 

lago  duplicates  the  cfTcct  of  50  percent  of  the  moves  made  by  the  two  cxpeits  while  using  its  inexact 
evaluation  (moves  1-45),  and  59  percent  of  their  moves  over  die  entire  game.  Interpretation  of  these 
percentages  is  complicated  by  the  possibility  that  lago  is  better  than  the  experts.  If  so.  a  high  degree  of  match 
is  a  reflection  of  the  expert's  capability,  not  of  lago' s  It  would  be  interesting  to  compare  this  with  a  typical 
percentage  of  agreement  between  two  experts,  but  that  information  is  not  available. 

The  values  do  compare  favorably  with  the  38  percent  match  against  expert  moves  (identical  moves  only) 
reported  by  Samuel  for  his  checker  program  (Samuel  67).  Samuel  achieved  a  match  against  64  percent  of  the 
experts'  moves  when  both  of  the  top  tw  o  choices  were  included.  A  corresponding  value  cannot  he  generated 
for  lago  because  its  search  procedure  is  only  guaranteed  to  have  the  best  move  ordered  correctly;  all  of  the 
other  alternatives  may  be  out  of  order. 

Another  question  that  can  be  asked  about  this  data  is  how  well  die  fraction  correlates  with  die  outcome  of 
the  game.  In  the  first  game,  both  playcs  agreed  with  lago  62  percent  of  die  time.  The  lack  of  a  difference 
between  die  percentages  for  the  two  players,  combined  with  the  lopsided  outcome  of  die  game  (a  24  disc 
victory  by  Ccrf),  yields  a  poor  correlation  between  the  percentages  and  die  outcome.  However,  it  turns  out 
that  if  Mimura  had  played  46-g7  instead  of  46-W,  he  would  have  been  assured  of  losing  by  no  more  than  2 
discs  --  a  result  consistent  with  the  percentages.  In  the  second  game,  lago  agrees  with  60  percent  of  Cerfs 
moves  and  53  percent  of  Mimura's  moves  --  a  result  consistent  with  a  22  disc  victory  by  Ccrf. 

5.2.  Positional  value 

The  last  two  columns  in  l  ablc  5-1  contain  the  mean  differences  between  the  values  of  the  moves  selected 
by  lago  and  those  actually  made  by  the  expens.  During  the  early  portions  of  the  game  (moves  1  to  45),  both 
values  arc  computed  by  lagos  evaluation  function.  Since  logo's  evaluation  will  always  assign  a  higher  value 
to  its  own  selection  than  to  the  move  made  by  the  expert  (unless  the  search  is  aborted  before  the  expert’s 
move  is  found),  the  differences  arc  always  positive.  By  comparing  these  differences  for  the  two  experts, 
conclusions  can  be  drawn  about  the  closeness  of  logo's  style  of  play  to  that  of  the  experts.  In  both  games,  the 
mean  difference  is  smaller  for  Ccrf  than  for  Mimura.  The  same  is  apparent  when  the  differences  arc  looked 
at  in  more  detail.  Figure  5-5  shows  the  value  for  each  position  in  which  Mimura  was  to  move  (in  the  second 
game).  The  two  curves  represent  the  values  for  logo's  selection  and  Mimura's  selection.  Figure  5-6  shows  the 
equivalent  curves  for  positions  in  which  Ccrf  is  to  play. 

These  curves  reinforce  the  conclusion  that  logo's  evaluation  is  closer  in  effect  to  Cerfs  style  of  play,  than  to 
Mimura's.  They  also  reveal  that  Mimura  probably  erred  late  in  this  game  as  well.  At  move  39,  lago  assigns  a 


Value  Value 
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Figure  5-5:  logo's  evaluation  of  the  moves  maile  by  Mimura.  and  its  own  choices 
Second  game  of  the  1980  World  Championship  match 
Mimura  (H-21)  vs.  Cerf(W-43) 


Figure  5-6:  logo's  evaluation  of  the  moves  made  by  Ccrf.  and  its  own  choices 
Second  game  of  the  1980  W  orld  Championship  match 
Mimura  (H-21)  vs.  Ccrf  (W-43) 
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value  ol  1820  lu  Minima's  position.  I  Ills  is  a  vei)  small  value  loi  this  point  in  ills  game  (.i^.nir.i  Heveni 
Challenger,  lago  assigned  a  value  of  -6277.11  to  Re  verst  Challenger \  position  at  move  to)  mi  mil.  ii  .hoiild 
not  be  enough  ol  a  problem  to  cause  a  loss  by  2?  discs.  To  investigate  this,  lago  was  set  up  to  play  against 
itself  with  a  constant  tune  allocation  of  2  minutes  per  move,  starting  with  the  board  position  immediately 
following  move  by  C'erf  (f  igure  5-7a).  The  game  continuation  can  lv  sc  ;;  in  I  iem  •  black 

(Mimura  s color)  won  by  4  discs  ( Figure  5-7c).  lago\  evaluation  at  move  39  is  consistent  with  this  outcome. 


(a)  I  he  board  before  move  39  (b)  The  game  record  from  move  39  (c)  I  he  final  position 

Figure  5-7:  lago  s  continuation  of  the  second  game  of  the  1980  world  championship  match 
Ihc  first  38  moves  were  lake  from  the  game  record  (Figure  5  la) 

Black  wins  34  to  30 

During  the  latter  portion  of  the  game  (moves  46  to  60),  end-game  searches  are  performed  to  evaluate  the 
moves  selected  logo's  moves  were  optimal,  so  the  differences  reflect  die  mean  number  ol  discs  lost  by  the 
experts  as  a  result  of  their  end  game  moves. 

5.3.  Componential  analysis 

Figure  5-8  shows  the  weighted  edge  value,  determined  by  search,  at  each  point  in  the  game  It  shows  die 
same  general  character  as  the  one  in  Figure  4-3;  a  long  stretch  in  which  only  minor  advantages  are  gamed, 
followed  by  an  abrupt  skyrocketing  of  the  value,  Ihis  skyrocketing  occurs  after  die  bad  moves  by  Mimura. 
Figure  5-9  shows  the  legal-move  and  potential-mobility  measures  for  the  game  Mimura  s  early  lead  (Figure 
5-5)  was  due  to  his  advantage  in  potential  mobility,  but  he  was  unable  to  convert  it  into  a  legal-move 
advantage.  Neither  side  ever  achieved  a  large  mobility  advantage,  except  at  the  very  end  when  mobility  was 
traded  ofT  for  edge  advantage  One  interesting  aspect  of  these  two  curves  is  that  an  advantage  in  legal  moves  is 
always  preceded  by  a  peak  in  the  potential  mobility  curve.  This  is  a  good  example  of  how  the  potential 
mobility  measures  can  serve  as  a  predictor  of  future  legal  moves. 


Weighted  Value  Weighted  Value 
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Figure  5-8:  The  weighted  edge  value  (from  Ccrf  s  point  of  view) 
Second  game  of  the  1980  World  Championship  match 
Mimura  (B-21)  vs.  Ccrf  (W-43) 


Figure  5-9:  The  weighted  legal-move  and  combined  potential-mobility  measures  (from  Ccrf  s  point  of  view) 
Second  game  of  the  1980  World  Championship  match 
Mimura  (B-21)  vs.  Ccrf  (W-43) 
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6.  Conclusion 

Modern  artificial  intelligence  programs  have  been  developed  for  a  few  games  (chess,  checkers, 
backgammon,  and  kalah  (sec  Slagle  (1971))  ).  Now  logo  adds  a  successful  example  f  or  the  game  of  Othello. 
All  of  the  major  mechanisms  that  have  emerged  in  game  playing  programs  are  used  m  some  form.  So  logo's 
world-championship  level  of  play  may  he  taken  as  evidence  for  the  basic  adequacy  of  the  existing  class  of 
mechanisms  developed  in  artif  intelligence. 

Development  of  this  application  involved:  (l)  a  task  analysis,  yielding  knowledge  about  the  game  and  a 
control  structure  for  its  application;  (2)  implementation  of  the  knowledge  and  structure  through  state-of-the- 
art  techniques;  and  (3)  an  analysis  of  the  performance  of  the  resulting  system. 

Othello  has  been  analyzed  into  a  pair  of  major  strategic  concepts  (stable  territory  and  mobility),  each 
decomposable  into  sub-concepts.  Combined  with  the  a-fi  search  algorithm,  iterative  deepening,  and  move 
ordering,  these  concepts  form  the  basis  of  most  of  logo.  Additional  aspects  of  logo  handle  end-game 
searches,  time  allocation,  and  the  efficiency  of  both  the  algorithms  and  the  implementation. 

logo  has  been  evaluated  by:  (1)  direct  play  against  other  programs  --  resulting  in  a  10-0  record;  (2)  an 
analysis  of  the  effectiveness  of  its  mechanisms  as  implementations  of  the  underlying  Othello  concepts;  and  (3) 
comparing  its  analyses  of  expert  games  with  the  experts’  play  -  show  ing  that  logo  is  able  to  approximate  their 
level  of  performance,  while  avoiding  some  of  their  major  errors. 
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